US20130282428A1 - Probabilistic inference of site demographics from aggregate user internet usage and source demographic information - Google Patents


Info

Publication number
US20130282428A1
Authority
US
United States
Prior art keywords
demographic attribute
users
online
value
estimating
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US13/651,763
Inventor
Ching Law
Gokul Rajaram
Rama Ranganath
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Google LLC
Original Assignee
Google LLC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Google LLC
Priority to US13/651,763
Publication of US20130282428A1
Assigned to GOOGLE INC. reassignment GOOGLE INC. ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: LAW, CHING, RAJARAM, GOKUL, RANGANATH, RAMA
Assigned to GOOGLE LLC reassignment GOOGLE LLC CHANGE OF NAME (SEE DOCUMENT FOR DETAILS). Assignors: GOOGLE INC.

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201 Market modelling; Market analysis; Collecting market data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201 Market modelling; Market analysis; Collecting market data
    • G06Q30/0204 Market segmentation

Definitions

  • the present invention concerns determining demographic information.
  • the present invention concerns probabilistically determining demographic information for a domain, such as a Website for example.
  • Demographic targeting is an important mode of targeting used by advertisers.
  • demographic information is typically only available for large Websites on the Internet. This is likely because the third parties that supply demographic information do so using a panel of 50,000-100,000 users. Consequently, these third parties can only get statistically significant user data for large Websites. This means that there is no way for these third parties to infer the user demographics for the vast majority of Websites on the Internet. This is unfortunate, because having reliable Internet-wide demographics would enable more advertising revenue to become available to smaller Websites, instead of just the large ones for which demographics are known.
  • Websites could self-describe their demographics. However, advertisers would probably not trust data supplied directly by the Website owner. For example, Website owners have an incentive to say “My visitors are all spendthrift millionaires”, whether or not this is true, in order to attract high-revenue advertisements.
  • Embodiments consistent with the present invention may be used to determine a demographic attribute value of a sink online document given a set of users each of whom visited at least one of the source documents and the sink document. At least some of these embodiments may do so by (a) accepting a set of one or more values of the demographic attribute, each of the one or more demographic attribute values being associated with a source online document, wherein each of the source online documents has a value for the demographic attribute and has been visited by at least one user of the given set, (b) determining an estimate of the demographic attribute value of each of the users of the given set using the accepted demographic attribute value of each of the source online documents visited by the user, and (c) determining the demographic attribute value of the sink online document using the determined estimate of the demographic attribute value of each of the users of the given set.
  • the documents are Web pages, or Websites.
  • FIG. 1 is a bubble diagram illustrating various operations that may be performed, and various information that may be used and/or generated, by exemplary embodiments consistent with the present invention.
  • FIG. 2 is a flow diagram of an exemplary method for performing the general operations for estimating demographic information of a Website in a manner consistent with the present invention.
  • FIG. 3 is a flow diagram of an exemplary method for estimating demographic information of a Website in a manner consistent with the present invention.
  • FIG. 4 is a block diagram of an exemplary apparatus that may perform various operations, and store information used and/or generated by such operations, in a manner consistent with the present invention.
  • the present invention may involve novel methods, apparatus, message formats, and/or data structures for determining demographic information of a Website by using a set of source Websites with known demographic information and a given set of users each of whom visited at least one of the source Websites and the Website.
  • the following description is presented to enable one skilled in the art to make and use the invention, and is provided in the context of particular applications and their requirements.
  • the following description of embodiments consistent with the present invention provides illustration and description, but is not intended to be exhaustive or to limit the present invention to the precise form disclosed.
  • Various modifications to the disclosed embodiments will be apparent to those skilled in the art, and the general principles set forth below may be applied to other embodiments and applications.
  • FIG. 1 is a bubble diagram illustrating various operations that may be performed, and various information that may be used and/or generated, by exemplary embodiments consistent with the present invention.
  • demographic information of source online documents (seed websites) 110 may be available to the user demographic information estimation operation 150 .
  • the operations 150 may obtain user information from a client device of a user 130 (e.g., via a browser toolbar). Such user information may be used to draw a given set of users, each of whom visited at least one of the source online documents and sink online documents.
  • Such user information may be generated by tracking users moving across various Websites (both source (seed) Websites 110 and sink (non-seed) Websites 120) with the help of a browser toolbar.
  • the operations 150 may estimate user demographic information for all users in the given set of users defined above.
  • the estimated demographic information of each user in the given set generated by the operations 150 may be provided to the demographic information estimation operations 160 .
  • the operations 160 may use the estimated demographic information of each user within the given set to determine estimated demographic information 170 of sink online documents 120 .
  • FIG. 2 is a flow diagram of an exemplary method 200 that might be used to probabilistically estimate demographic information of a domain or Website in a manner consistent with the present invention.
  • the method 200 may accept exact demographic information from a set of source online documents (e.g., seed Websites). (Block 210 ) Thereafter, the method 200 may probabilistically estimate demographic information of sink online documents (e.g., non-seed Websites) by using demographic information of source online documents and the pair-wise relationship between the documents (both sink and source online documents). (Block 220 )
  • the method 200 might probabilistically estimate demographic information as follows.
  • let d be a demographics attribute, which is a function from a set of Websites to a probability.
  • d(s) is considered as the minimum probability that a pageview on Website s would satisfy this demographics attribute (i.e., that the pageview would be by a user with the demographic attribute).
  • d is the attribute “age 25-34”
  • let p be a function on the set of edges of the graph G, where nodes of the graph G represent domains (e.g., Websites) or Web pages.
  • let p(a,b) represent the probability that a pageview at Website b is initiated by a visitor of Website a.
  • Some embodiments consistent with the present invention might use a damping factor α ∈ (0,1) to express how dependent or independent the traffic is of the demographics property. Specifically, if the traffic data is independent of the demographics property, then α would be 1 (1 means no damping factor at all, which is the case when the traffic data is independent of demographics). Otherwise α would be a factor less than 1 indicating some preservation of demographics property in the traffic flow.
  • a reasonable value for α can be derived by observing the demographics of source Websites for which there is traffic data. For example, if only users of a certain demographics property move from Website A to Website B, and if users without this property would move to Website C, then α might be set close to zero for this particular property.
  • a lower-bound estimate of the demographics d on t as contributed by s can be determined as follows:
  • $e(t) = \alpha \sum_{s \in S \subseteq G} p(s,t)\, d(s)$.
  • the value of the demographic attribute for the Website t may be estimated as the average value of d(u) for all visitors u ∈ U of Website t:
  • the users demographics approach can work with either pageviews or unique users.
  • the above formula estimates the demographics of a random visitor of Website t. If frequency estimates of the visitors of Website t are also available, then the demographics of a random pageview at the Website t can also be estimated.
  • the demographics of the Website s can be estimated with either of the foregoing techniques.
  • the estimate may then be compared with the given (actual) value d (s).
  • the estimates should not exceed the provided d (s) values for most of the Websites in S.
  • FIG. 3 is a flow diagram of an exemplary method 300 that may be used to estimate demographic information of a document (referred to as a “Sink online document”) such as a domain or Website for example, in a manner consistent with the present invention.
  • the method 300 may accept a set of one or more values of the demographic attribute, each of which is associated with a source online document, wherein each of the source online documents has a value for the demographic attribute and has been visited by at least one user of a set of users who have also visited a sink document.
  • the method 300 may determine an estimate of the demographic attribute value of each of the users of the given set using the accepted demographic attribute value of each of the source online documents visited by the user.
  • the method 300 may determine the demographic attribute value of the sink online document using the determined estimate of the demographic attribute value of each of the users of the given set.
  • the given set of users might be users who have visited both at least one of the source documents and the sink document.
  • This set of users can be derived from browser toolbars which can track Websites visited by users.
  • the method 300 might determine an estimate of the demographic attribute value of each of the users in the given set by (i) summing, over all the source online documents visited by the user, the corresponding demographic attribute value of the source online documents to generate a summing result, and (ii) dividing the summing result by the number of source online documents visited by the user.
  • the method 300 may determine the demographic attribute value of the sink online document by (i) summing, over all the users of the given set, the corresponding determined estimate of the demographic attribute value of each of the users to generate a summing result, and (ii) dividing the summing result by the number of users of the given set.
  • FIG. 4 is a high-level block diagram of a machine 400 that may perform one or more of the operations discussed above.
  • the machine 400 basically includes one or more processors 410 , one or more input/output interface units 430 , one or more storage devices 420 , and one or more system buses and/or networks 440 for facilitating the communication of information among the coupled elements.
  • One or more input devices 432 and one or more output devices 434 may be coupled with the one or more input/output interfaces 430 .
  • the machine 400 may be, for example, an advertising server, or it may be a plurality of servers distributed over a network.
  • the one or more processors 410 may execute machine-executable instructions (e.g., C or C++ running on the Solaris operating system available from Sun Microsystems Inc. of Palo Alto, Calif. or the Linux operating system widely available from a number of vendors such as Red Hat, Inc. of Durham, N.C.) to effect one or more aspects of the present invention. At least a portion of the machine executable instructions may be stored (temporarily or more permanently) on the one or more storage devices 420 and/or may be received from an external source via one or more input interface units 430 .
  • the machine-executable instructions might be stored as modules (e.g., corresponding to the above-described operations).
  • the machine 400 may be one or more conventional personal computers.
  • the processing units 410 may be one or more microprocessors.
  • the bus 440 may include a system bus.
  • the storage devices 420 may include system memory, such as read only memory (ROM) and/or random access memory (RAM).
  • the storage devices 420 may also include a hard disk drive for reading from and writing to a hard disk, a magnetic disk drive for reading from or writing to a (e.g., removable) magnetic disk, and an optical disk drive for reading from or writing to a removable (magneto-) optical disk such as a compact disk or other (magneto-) optical media.
  • a user may enter commands and information into the personal computer through input devices 432 , such as a keyboard and pointing device (e.g., a mouse) for example.
  • Other input devices such as a microphone, a joystick, a game pad, a satellite dish, a scanner, or the like, may also (or alternatively) be included.
  • These and other input devices are often connected to the processing unit(s) 410 through an appropriate interface 430 coupled to the system bus 440 .
  • the output devices 434 may include a monitor or other type of display device, which may also be connected to the system bus 440 via an appropriate interface.
  • the personal computer may include other (peripheral) output devices (not shown), such as speakers and printers for example.
  • the online documents might be documents served by server computers.
  • the users 130 might access the online documents using a client device, such as a personal computer, a mobile telephone, a mobile device, etc., having a browser.
  • the operations 150 and 160 might be performed by one or more computers.
  • the source demographic attribute information might be exact or non-exact demographic information of a small set of large Websites. This information might be collected from the Internet surfing behavior of opted-in panelists (e.g., 50,000-100,000 in number) whose exact demographics are known. For each Website in this list, the information supplied might include one or more of the following demographic information: Age, Gender, Household Income, Education, # Children (Household size), Connection speed, etc. Thus, this data might be used as “seed” data.
  • the surfing behavior of an extremely large number (e.g., millions) of users might be analyzed to compute user traffic inflows and outflows for every Website.
  • This data might be obtained from client software (e.g., a browser toolbar) installed on users' computers.
  •                                  Website S2   Website S3
    Age (20-35)                      80%          60%
    Age (36-60)                      20%          40%
    Gender M                         85%          75%
    Gender F                         15%          25%
    Household income ($70K-$100K)    10%          45%
  • this approach estimates the demographic properties of each single user and subsequently determines the demographic properties of each non-seed Website.
  • the first step in this approach is to estimate the demographic property d(u) of user u.
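  • a simple approach is to take the average of d(s) over all seed Websites s visited by u, which can be represented by the following equation: $e(u) = \frac{\sum_{s \in S_u} d(s)}{|S_u|}$, where $S_u$ is the set of seed Websites visited by u.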
  • the above results are estimates of the demographic property that “users are male” of each user.
  • the next step is to take, for each non-seed Website “t”, the average value of d(u) (or e(u)) over all users that visited site “t”: $e(t) = \frac{\sum_{u \in U_t} e(u)}{|U_t|}$, where $U_t$ is the set of users that visited t.
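  • As a worked illustration of the two averaging steps, take the “Gender M” values listed above for the seed Websites, read here as d(S2) = 0.85 and d(S3) = 0.75 (the column order is an assumption). The users u1 and u2 and the non-seed Website t are hypothetical: u1 is assumed to have visited S2, S3 and t, while u2 is assumed to have visited only S2 and t.

```latex
\begin{aligned}
e(u_1) &= \frac{d(S_2) + d(S_3)}{2} = \frac{0.85 + 0.75}{2} = 0.80 \\
e(u_2) &= \frac{d(S_2)}{1} = 0.85 \\
e(t)   &= \frac{e(u_1) + e(u_2)}{2} = \frac{0.80 + 0.85}{2} = 0.825
\end{aligned}
```

  • Under these assumptions, the estimated probability that a random visitor of t is male would be 0.825.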
  • embodiments consistent with the present invention may be used to provide useful estimates of demographic information for domains, such as Websites for example.

Abstract

A demographic attribute value of a sink online document (such as Websites or Web pages) may be determined given a set of users who have visited at least one of the source documents and the sink document, by (a) accepting a value(s) of the demographic attribute, each of which values is associated with a source online document (where each of the source online documents has a value for the demographic attribute and has been visited by at least one user of the given set), (b) determining an estimate of the demographic attribute value of each of the users of the given set using the accepted demographic attribute value of each of the source online documents visited by the user, and (c) determining the demographic attribute value of the sink online document using the determined estimate of the demographic attribute value of each of the users of the given set.

Description

    §0. RELATED APPLICATION(S)
  • This application is a continuation of U.S. patent application Ser. No. 11/699,745 (incorporated herein by reference), titled “PROBABILISTIC INFERENCE OF SITE DEMOGRAPHICS FROM AGGREGATE USER INTERNET USAGE AND SOURCE DEMOGRAPHIC INFORMATION,” filed on Jan. 30, 2007 and listing Ching LAW, Gokul RAJARAM, and Rama RANGANATH as inventors.
  • §1. BACKGROUND OF THE INVENTION
  • §1.1 Field of the Invention
  • The present invention concerns determining demographic information. In particular, the present invention concerns probabilistically determining demographic information for a domain, such as a Website for example.
  • §1.2 Background Information
  • Demographic targeting is an important mode of targeting used by advertisers. Currently, demographic information is typically only available for large Websites on the Internet. This is likely because the third parties that supply demographic information do so using a panel of 50,000-100,000 users. Consequently, these third parties can only get statistically significant user data for large Websites. This means that there is no way for these third parties to infer the user demographics for the vast majority of Websites on the Internet. This is unfortunate, because having reliable Internet-wide demographics would enable more advertising revenue to become available to smaller Websites, instead of just the large ones for which demographics are known.
  • Naturally, small Websites could self-describe their demographics. However, advertisers would probably not trust data supplied directly by the Website owner. For example, Website owners have an incentive to say “My visitors are all spendthrift millionaires”, whether or not this is true, in order to attract high-revenue advertisements.
  • §2. SUMMARY OF THE INVENTION
  • Embodiments consistent with the present invention may be used to determine a demographic attribute value of a sink online document given a set of users each of whom visited at least one of the source documents and the sink document. At least some of these embodiments may do so by (a) accepting a set of one or more values of the demographic attribute, each of the one or more demographic attribute values being associated with a source online document, wherein each of the source online documents has a value for the demographic attribute and has been visited by at least one user of the given set, (b) determining an estimate of the demographic attribute value of each of the users of the given set using the accepted demographic attribute value of each of the source online documents visited by the user, and (c) determining the demographic attribute value of the sink online document using the determined estimate of the demographic attribute value of each of the users of the given set.
  • In at least some embodiments consistent with the present invention, the documents are Web pages, or Websites.
  • §3. BRIEF DESCRIPTION OF THE DRAWINGS
  • FIG. 1 is a bubble diagram illustrating various operations that may be performed, and various information that may be used and/or generated, by exemplary embodiments consistent with the present invention.
  • FIG. 2 is a flow diagram of an exemplary method for performing the general operations for estimating demographic information of a Website in a manner consistent with the present invention.
  • FIG. 3 is a flow diagram of an exemplary method for estimating demographic information of a Website in a manner consistent with the present invention.
  • FIG. 4 is a block diagram of an exemplary apparatus that may perform various operations, and store information used and/or generated by such operations, in a manner consistent with the present invention.
  • §4. DETAILED DESCRIPTION
  • The present invention may involve novel methods, apparatus, message formats, and/or data structures for determining demographic information of a Website by using a set of source Websites with known demographic information and a given set of users each of whom visited at least one of the source Websites and the Website. The following description is presented to enable one skilled in the art to make and use the invention, and is provided in the context of particular applications and their requirements. Thus, the following description of embodiments consistent with the present invention provides illustration and description, but is not intended to be exhaustive or to limit the present invention to the precise form disclosed. Various modifications to the disclosed embodiments will be apparent to those skilled in the art, and the general principles set forth below may be applied to other embodiments and applications. For example, although a series of acts may be described with reference to a flow diagram, the order of acts may differ in other implementations when the performance of one act is not dependent on the completion of another act. Further, non-dependent acts may be performed in parallel. No element, act or instruction used in the description should be construed as critical or essential to the present invention unless explicitly described as such. Also, as used herein, the article “a” is intended to include one or more items. Where only one item is intended, the term “one” or similar language is used. In the following, “information” may refer to the actual information, or a pointer to, identifier of, or location of such information. Thus, the present invention is not intended to be limited to the embodiments shown and the inventors regard their invention to include any patentable subject matter described.
  • In the following, exemplary environments in which, or with which, exemplary embodiments consistent with the present invention may operate, are described in §4.1. Then, exemplary embodiments consistent with the present invention are described in §4.2. Some illustrative examples of exemplary operations of exemplary embodiments consistent with the present invention are provided in §4.3. Finally, some conclusions regarding the present invention are set forth in §4.4.
  • §4.1 Exemplary Environment in which, or with which, Exemplary Embodiments Consistent with the Present Invention May Operate
  • FIG. 1 is a bubble diagram illustrating various operations that may be performed, and various information that may be used and/or generated, by exemplary embodiments consistent with the present invention. In particular, demographic information of source online documents (seed Websites) 110 may be available to the user demographic information estimation operations 150. Further, the operations 150 may obtain user information from a client device of a user 130 (e.g., via a browser toolbar). Such user information may be used to draw a given set of users, each of whom visited at least one of the source online documents and sink online documents. Such user information may be generated by tracking users moving across various Websites (both source (seed) Websites 110 and sink (non-seed) Websites 120) with the help of a browser toolbar. Using such a given set of users and exact demographic information of source Websites 110, the operations 150 may estimate user demographic information for all users in the given set of users defined above. The estimated demographic information of each user in the given set generated by the operations 150 may be provided to the demographic information estimation operations 160. The operations 160 may use the estimated demographic information of each user within the given set to determine estimated demographic information 170 of sink online documents 120.
  • Various exemplary embodiments of the present invention are now described in §4.2.
  • §4.2 Exemplary Embodiments
  • FIG. 2 is a flow diagram of an exemplary method 200 that might be used to probabilistically estimate demographic information of a domain or Website in a manner consistent with the present invention. In particular, the method 200 may accept exact demographic information from a set of source online documents (e.g., seed Websites). (Block 210) Thereafter, the method 200 may probabilistically estimate demographic information of sink online documents (e.g., non-seed Websites) by using demographic information of source online documents and the pair-wise relationship between the documents (both sink and source online documents). (Block 220)
  • Referring back to block 220, the method 200 might probabilistically estimate demographic information as follows. Let d be a demographics attribute, which is a function from a set of Websites to a probability. Thus d(s) ∈ [0,1] for any Website s. In particular, d(s) is considered as the minimum probability that a pageview on Website s would satisfy this demographics attribute (i.e., that the pageview would be by a user with the demographic attribute). For example, if d is the attribute “age 25-34”, then d(site.com)=0.5 means that a pageview on site.com has a minimum probability of 0.5 of being generated from a visitor of age 25-34.
  • Assume that the function d is only known for a set of source Websites S which is a subset of the universe of all Websites G. Embodiments consistent with the present invention might be used to estimate the values of d on other Websites.
  • In the following, two alternative approaches for estimating the demographics function d—Upstream/Downstream Traffic, and Users Demographics—are described.
  • §4.2.1 Upstream/Downstream Traffic Approach
  • In the Upstream/Downstream Traffic approach, pair-wise relations between Websites are examined by tracking the users who move across the Websites during their browsing sessions.
  • Let p be a function on the set of edges of the graph G, where nodes of the graph G represent domains (e.g., Websites) or Web pages. For any two Websites a and b, let p(a,b) represent the probability that a pageview at Website b is initiated by a visitor of Website a. Function p can be derived from information tracking users who have visited Website a and/or Website b. Such information may be recorded in toolbar traffic logs. For example, if p(aa.com,bb.com)=0.1, then a pageview on bb.com has a probability of 0.1 of being generated by a visitor of site aa.com.
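  • One way the function p might be derived from traffic logs is sketched below. The log format (one `(user_id, site)` tuple per pageview) and the function name `estimate_p` are assumptions made for illustration and are not specified in this description; p(a,b) is approximated as the fraction of pageviews at b made by users who also appear as visitors of a.

```python
from collections import defaultdict


def estimate_p(pageviews):
    """Approximate p(a, b) from a pageview log.

    pageviews: iterable of (user_id, site) tuples, one tuple per pageview
    (a hypothetical log format).
    Returns a dict mapping (a, b) to the fraction of pageviews at site b
    that were made by users who also visited site a.
    """
    pageviews = list(pageviews)

    # Collect the set of sites each user visited at least once.
    sites_by_user = defaultdict(set)
    for user, site in pageviews:
        sites_by_user[user].add(site)

    views_at = defaultdict(int)    # total pageviews at each site b
    views_from = defaultdict(int)  # pageviews at b by visitors of a
    for user, b in pageviews:
        views_at[b] += 1
        for a in sites_by_user[user]:
            if a != b:
                views_from[(a, b)] += 1

    return {(a, b): n / views_at[b] for (a, b), n in views_from.items()}


log = [
    ("u1", "aa.com"), ("u1", "bb.com"),
    ("u2", "bb.com"),
    ("u3", "bb.com"), ("u3", "bb.com"),
]
p = estimate_p(log)
print(p[("aa.com", "bb.com")])  # 1 of the 4 pageviews at bb.com
```

  • This sketch ignores the order and timing of visits within a session; an embodiment that tracks direct clicks between Websites would need a richer log than the one assumed here.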
  • Some embodiments consistent with the present invention might use a damping factor α ∈ (0,1) to express how dependent or independent the traffic is of the demographics property. Specifically, if the traffic data is independent of the demographics property, then α would be 1 (1 means no damping factor at all, which is the case when the traffic data is independent of demographics). Otherwise α would be a factor less than 1 indicating some preservation of demographics property in the traffic flow. A reasonable value for α can be derived by observing the demographics of source Websites for which there is traffic data. For example, if only users of a certain demographics property move from Website A to Website B, and if users without this property would move to Website C, then α might be set close to zero for this particular property.
  • For each site t≠s, a lower-bound estimate of the demographics d on t as contributed by s can be determined as follows:
  • $p(s,t) \times d(s) \times \alpha$
  • Repeating this calculation for all pairs s, t, an estimate e(t) can be expressed as:
  • $e(t) = \alpha \sum_{s \in S \subseteq G} p(s,t)\, d(s)$.
  • This can be repeated, using e as function d in the next iteration (e.g., until the estimate is not further improved).
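  • The iteration described above can be sketched as follows. This is a minimal illustration under stated assumptions, not a definitive implementation: the function name `propagate`, the choice to keep the known source values fixed between iterations, and the stopping rule (largest change below a tolerance `tol`, or `max_iter` rounds) are all assumptions, since the description only says the calculation is repeated until the estimate is not further improved.

```python
def propagate(d_known, p, alpha, sites, max_iter=20, tol=1e-6):
    """Iterate e(t) = alpha * sum_s p(s, t) * d(s), feeding e back in as d.

    d_known: dict site -> known demographic value d(s) for source sites.
    p:       dict (s, t) -> probability that a pageview at t is initiated
             by a visitor of s; missing pairs are treated as 0.
    alpha:   damping factor.
    sites:   all sites to estimate (sources and non-sources).
    """
    d = dict(d_known)
    for _ in range(max_iter):
        e = {}
        for t in sites:
            if t in d_known:
                # Assumption: known source values are never overwritten.
                e[t] = d_known[t]
            else:
                e[t] = alpha * sum(
                    p.get((s, t), 0.0) * d_s
                    for s, d_s in d.items()
                    if s != t
                )
        change = max((abs(e[t] - d.get(t, 0.0)) for t in sites), default=0.0)
        d = e
        if change < tol:
            break
    return d


d_known = {"s1": 0.8, "s2": 0.4}
p = {("s1", "t"): 0.5, ("s2", "t"): 0.25}
result = propagate(d_known, p, alpha=0.5, sites=["s1", "s2", "t"])
print(result["t"])
```

  • In this toy input, t receives 0.5 × (0.5 × 0.8 + 0.25 × 0.4) = 0.25, and a second pass leaves the value unchanged, so the loop stops.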
  • One potential disadvantage of the upstream/downstream traffic approach is that it might depend on the direct clicks between Websites to infer the demographics information. However, a Website's demographics from all the upstream and downstream traffic could deviate from its overall demographics. Notwithstanding such a potential deviation, if it can be assumed that such click traffic should be mostly independent of the overall demographics, then this approach should provide useful estimates.
  • §4.2.2 Users Demographics Approach
  • In the users demographics approach, demographics information of a user is inferred from the Websites that they visit (e.g., using client device browser toolbar information). For example, if a user u visits a Website s with d(s)=0.7, then a value 0.7 can be assigned to d(u). If u visits two independent Websites a and b, d(u) can be estimated to be (1−d(a))(1−d(b)). However, in general, it is not easy to show that the demographics of two Websites are independent. Further, given the fact that u visits both Websites, they cannot be assumed to be totally independent.
  • A simpler approach is to take the average of d(s) for all Websites s∈S visited by u. Let v be the visiting function where v(u,s)=1 if user u visits Website s, and v(u,s)=0 otherwise. Thus, all Websites in S visited by u may be expressed as S_u={s∈S|v(u,s)=1}. The estimated value of the demographic for the user can be expressed as:
  • $e(u) = \frac{\sum_{s \in S_u} d(s)}{|S_u|}.$
  • This would be an estimation of the demographics of user u. Then, for any Website t not in S (note that S is the set of all Websites for which there is demographics information from an external source, and it is desired to estimate the demographics function of Websites not in S), the value of the demographic attribute for the Website t may be estimated as the average value of d(u) for all visitors u∈U of Website t:
  • $e(t) = \frac{\sum_{u \in U_t} e(u)}{|U_t|},$
  • where U_t={u∈U|v(u,t)=1}.
  • Thus, the users demographics approach can work with either pageviews or unique users. The above formula estimates the demographics of a random visitor of Website t. If frequency estimates of the visitors of Website t are also available, then the demographics of a random pageview at the Website t can also be estimated.
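  • The two averaging steps above, together with a pageview variant, might be sketched as follows. This is a hedged illustration only: the function names, the dictionary layout, and the sample values are hypothetical, and weighting each visitor by a pageview count is one assumed reading of how frequency estimates could be used.

```python
def estimate_user(visited_sites, d):
    """e(u): mean of d(s) over the seed sites among those the user visited."""
    known = [d[s] for s in visited_sites if s in d]
    if not known:
        return None  # the user visited no seed site, so no estimate exists
    return sum(known) / len(known)


def estimate_site(visitor_estimates, pageviews=None):
    """e(t): mean of e(u) over visitors, optionally weighted by pageviews.

    visitor_estimates -- dict mapping user id to e(u)
    pageviews         -- optional dict mapping user id to a visit count
    """
    if pageviews is None:
        weights = {u: 1.0 for u in visitor_estimates}
    else:
        weights = {u: float(pageviews.get(u, 0)) for u in visitor_estimates}
    total_weight = sum(weights.values())
    if total_weight == 0:
        return None
    weighted = sum(weights[u] * e_u for u, e_u in visitor_estimates.items())
    return weighted / total_weight


# Hypothetical seed values and visits, for illustration only.
d = {"a": 0.6, "b": 0.8}
users = {"u1": estimate_user(["a", "b"], d), "u2": estimate_user(["b"], d)}
per_visitor = estimate_site(users)                        # random visitor
per_pageview = estimate_site(users, {"u1": 3, "u2": 1})   # random pageview
```

  • With equal weights the second function reduces to the unweighted average over U_t.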
  • §4.2.3 Evaluation
  • To evaluate either of the foregoing approaches, given a Website s in the source set S, the demographics of the Website s can be estimated with either of the foregoing techniques. The estimate may then be compared with the given (actual) value d(s). In some conservative embodiments consistent with the present invention, the estimates should not exceed the provided d(s) values for most of the Websites in S.
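  • One way such a comparison might be carried out with the users demographics approach is sketched below. This is an assumption-laden sketch: withholding the site under test from each user's average is a choice made here for illustration and is not stated in the text, and the function name `holdout_errors` and the sample data are hypothetical.

```python
def holdout_errors(d, visits):
    """For each seed site s, re-estimate d(s) with s withheld.

    d      -- dict mapping seed sites to known values
    visits -- dict mapping user id to the set of sites that user visited
    Returns a dict mapping s to (estimate - d[s]); a positive value means
    the estimate exceeds the provided value.
    """
    errors = {}
    for s in d:
        user_estimates = []
        for sites in visits.values():
            if s not in sites:
                continue  # only visitors of s contribute
            others = [d[x] for x in sites if x in d and x != s]
            if others:
                user_estimates.append(sum(others) / len(others))
        if user_estimates:
            estimate = sum(user_estimates) / len(user_estimates)
            errors[s] = estimate - d[s]
    return errors


# Hypothetical data, for illustration only.
d = {"a": 0.6, "b": 0.8}
visits = {"u1": {"a", "b"}, "u2": {"b"}}
errors = holdout_errors(d, visits)
```

  • In a conservative embodiment, most entries of the returned dictionary would be expected to be zero or negative.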
  • §4.2.4 Exemplary Methods
  • FIG. 3 is a flow diagram of an exemplary method 300 that may be used to estimate demographic information of a document (referred to as a “Sink online document”) such as a domain or Website for example, in a manner consistent with the present invention. In particular, the method 300 may accept a set of one or more values of the demographic attribute, each of which being associated with a source online document, wherein each of the source online documents has a value for the demographic attribute and has been visited by at least one user of a set of users who have also visited a sink document. (Block 310) The method 300 may determine an estimate of the demographic attribute value of each of the users of the given set using the accepted demographic attribute value of each of the source online documents visited by the user. (Block 320) Finally, the method 300 may determine the demographic attribute value of the sink online document using the determined estimate of the demographic attribute value of each of the users of the given set. (Block 330)
  • Referring back to block 310, the given set of users might be users who have visited both at least one of the source documents and the sink document. This set of users can be derived from browser toolbars which can track Websites visited by users.
  • Referring back to block 320, the method 300 might determine an estimate of the demographic attribute value of each of the users in the given set by (i) summing, over all the source online documents visited by the user, the corresponding demographic attribute value of the source online documents to generate a summing result, and (ii) dividing the summing result with the number of source online documents visited by the user.
  • Referring back to block 330, the method 300 may determine the demographic attribute value of the sink online document by (i) summing, over all the users of the given set, the corresponding determined estimate of the demographic attribute value of each of the users to generate a summing result, and (ii) dividing the summing result with the number of users of the given set.
  • §4.2.5 Exemplary Apparatus
  • FIG. 4 is a high-level block diagram of a machine 400 that may perform one or more of the operations discussed above. The machine 400 basically includes one or more processors 410, one or more input/output interface units 430, one or more storage devices 420, and one or more system buses and/or networks 440 for facilitating the communication of information among the coupled elements. One or more input devices 432 and one or more output devices 434 may be coupled with the one or more input/output interfaces 430. The machine 400 may be, for example, an advertising server, or it may be a plurality of servers distributed over a network.
  • The one or more processors 410 may execute machine-executable instructions (e.g., C or C++ running on the Solaris operating system available from Sun Microsystems Inc. of Palo Alto, Calif. or the Linux operating system widely available from a number of vendors such as Red Hat, Inc. of Durham, N.C.) to effect one or more aspects of the present invention. At least a portion of the machine executable instructions may be stored (temporarily or more permanently) on the one or more storage devices 420 and/or may be received from an external source via one or more input interface units 430. The machine-executable instructions might be stored as modules (e.g., corresponding to the above-described operations).
  • In one embodiment consistent with the present invention, the machine 400 may be one or more conventional personal computers. In this case, the processing units 410 may be one or more microprocessors. The bus 440 may include a system bus. The storage devices 420 may include system memory, such as read only memory (ROM) and/or random access memory (RAM). The storage devices 420 may also include a hard disk drive for reading from and writing to a hard disk, a magnetic disk drive for reading from or writing to a (e.g., removable) magnetic disk, and an optical disk drive for reading from or writing to a removable (magneto-) optical disk such as a compact disk or other (magneto-) optical media.
  • A user may enter commands and information into the personal computer through input devices 432, such as a keyboard and pointing device (e.g., a mouse) for example. Other input devices such as a microphone, a joystick, a game pad, a satellite dish, a scanner, or the like, may also (or alternatively) be included. These and other input devices are often connected to the processing unit(s) 410 through an appropriate interface 430 coupled to the system bus 440. The output devices 434 may include a monitor or other type of display device, which may also be connected to the system bus 440 via an appropriate interface. In addition to (or instead of) the monitor, the personal computer may include other (peripheral) output devices (not shown), such as speakers and printers for example.
  • Referring back to FIG. 1, the online documents might be documents served by server computers. The users 130 might access the online documents using a client device, such as a personal computer, a mobile telephone, a mobile device, etc., having a browser. The operations 150 and 160 might be performed by one or more computers.
  • §4.2.6 Refinements and Alternatives
  • The source demographic attribute information might be exact or non-exact demographic information of a small set of large Websites. This information might be collected from the Internet surfing behavior of opted-in panelists (e.g., 50,000-100,000 in number) whose exact demographics are known. For each Website in this list, the information supplied might include one or more of the following demographic information: Age, Gender, Household Income, Education, # Children (Household size), Connection speed, etc. Thus, this data might be used as “seed” data.
  • The surfing behavior of an extremely large number (e.g., millions) of users might be analyzed to compute user traffic inflows and outflows for every Website. This data might be obtained from client software (e.g., a browser toolbar) installed on users' computers.
  • Although some of the exemplary embodiments were discussed in the context of Websites, embodiments consistent with the present invention might be used to infer demographic information in other contexts such as, for example, domains, Web pages, documents, etc.
  • §4.3 Examples of Operations
  • To illustrate the above operations of an exemplary method, a simplified example is presented. Assume that the universe of all Websites is G={S1, S2, S3, S4} and that the set of seed Websites, which is a subset of G, is S={S2, S3}. Demographic information for the seed Websites is known. The following is a sample of some demographic information for the two seed Websites S2 and S3.
  • Demographic property             Website S2   Website S3
    Age (20-35)                      80%          60%
    Age (36-60)                      20%          40%
    Gender M                         85%          75%
    Gender F                         15%          25%
    Household income ($70K-$100K)    10%          45%
  • Assume that d(S) is the demographic property that “users are male”. Then, it is known from the table above that d(S2)=0.85 and d(S3)=0.75. The objective is to probabilistically estimate d(S1) and d(S4).
  • In particular, this approach estimates the demographic properties of each single user and subsequently determines the demographic properties of each non-seed Website.
  • Again, assume that the universe of all Websites is G={S1,S2,S3,S4} and that the set of seed Websites, which is a subset of G, is S={S2,S3}. Demographic information for the seed Websites is as described above. Further, it is assumed that the universe of all users is U={u1,u2,u3,u4,u5}.
  • The first step in this approach is to estimate the demographic property d(u) of user u. A simple approach is to take the average of d(S) for all seed Websites visited by u which can be represented by the following equation:
  • $e(u) = \frac{\sum_{s \in S_u} d(s)}{|S_u|}.$
  • S_u is the set of seed Websites visited by user u. Assume that the sets of seed Websites visited by each user in the user set U={u1,u2,u3,u4,u5} are the following:
  • $S_{u_1} = \{S2, S3\}$, $S_{u_2} = \{S3\}$, $S_{u_3} = \{S2, S3\}$, $S_{u_4} = \{S2\}$, $S_{u_5} = \{S2\}$. Now the above equation may be used to estimate the average demographic property for each user:
  • For user u1:
  • $e(u_1) = \frac{d(S2) + d(S3)}{|S_{u_1}|} = \frac{0.85 + 0.75}{2} = 0.80$
  • For user u2:
  • $e(u_2) = \frac{d(S3)}{|S_{u_2}|} = \frac{0.75}{1} = 0.75$
  • For user u3:
  • $e(u_3) = \frac{d(S2) + d(S3)}{|S_{u_3}|} = \frac{0.85 + 0.75}{2} = 0.80$
  • For user u4:
  • $e(u_4) = \frac{d(S2)}{|S_{u_4}|} = \frac{0.85}{1} = 0.85$
  • For user u5:
  • $e(u_5) = \frac{d(S2)}{|S_{u_5}|} = \frac{0.85}{1} = 0.85$
  • The above results are estimates of the demographic property that “users are male” of each user. The next step is to take the average value of d(u) (or e(u)) for each non-seed Website “t” for all users that visited site “t”:
  • $e(t) = \frac{\sum_{u \in U_t} e(u)}{|U_t|}.$
  • U_t is the set of users that visited site “t”. Assume that the sets of users that visited each of the Websites are the following: $U_{S1} = \{u1, u2, u3, u5\}$, $U_{S2} = \{u1, u3, u4, u5\}$, $U_{S3} = \{u1, u2, u3\}$, $U_{S4} = \{u1, u2\}$. Now the above equation may be used to estimate the demographic property of “users are male” for every non-seed Website:
  • For Website S1:
  • $e(S1) = \frac{\sum_{u \in U_{S1}} e(u)}{|U_{S1}|} = \frac{e(u_1) + e(u_2) + e(u_3) + e(u_5)}{4} = \frac{0.80 + 0.75 + 0.80 + 0.85}{4} = 0.80$
  • For Website S4:
  • $e(S4) = \frac{\sum_{u \in U_{S4}} e(u)}{|U_{S4}|} = \frac{e(u_1) + e(u_2)}{2} = \frac{0.80 + 0.75}{2} = 0.775 \approx 0.78$
  • From the above the final results are:

  • d(S1)=0.80, d(S2)=0.85, d(S3)=0.75, and d(S4)=0.78
  • As a result, it has now been estimated probabilistically that 80% of users visiting Website S1 are male and 78% of users visiting Website S4 are male.
  • It is possible to probabilistically estimate any other demographic property (e.g., Age 20-35, Age 36-60, Household Income $70K-$100K, etc.) for Websites S1 and S4 in a similar manner.
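  • The arithmetic of this example can be checked with a short script. The sketch below uses only the values given above for the property “users are male”; the variable names are hypothetical and the code is offered as an illustration rather than as the method of any particular embodiment.

```python
# Known values for the seed Websites (property: "users are male").
d = {"S2": 0.85, "S3": 0.75}

# Seed Websites visited by each user, as assumed in the example.
seed_visits = {
    "u1": ["S2", "S3"],
    "u2": ["S3"],
    "u3": ["S2", "S3"],
    "u4": ["S2"],
    "u5": ["S2"],
}

# Users that visited each non-seed Website, as assumed in the example.
visitors = {
    "S1": ["u1", "u2", "u3", "u5"],
    "S4": ["u1", "u2"],
}

# Step 1: e(u) is the average of d(s) over the seed Websites visited by u.
e_user = {u: sum(d[s] for s in sites) / len(sites)
          for u, sites in seed_visits.items()}

# Step 2: e(t) is the average of e(u) over the users that visited t.
e_site = {t: sum(e_user[u] for u in users) / len(users)
          for t, users in visitors.items()}

for site, value in e_site.items():
    print(site, format(value, ".3f"))
```

  • The value computed for S4 is 0.775 before rounding, which the example reports as 0.78.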
  • §4.4 Conclusions
  • As can be appreciated from the foregoing, embodiments consistent with the present invention may be used to provide useful estimates of demographic information for domains, such as Websites for example.

Claims (21)

What is claimed is:
1. A computer-implemented method for estimating a value of a demographic attribute for a first online document, the method comprising:
a) receiving, by a computer system including at least one computer, a set of one or more known values of the demographic attribute associated with each of a plurality of second online documents that have been visited by a set of independent users who have also visited the first online document;
b) estimating, by the computer system, the value of the demographic attribute for the first online document using the accepted known values of the demographic attribute of each of the plurality of second online documents; and
c) storing, by the computer system, the estimated value of the demographic attribute for the first online document in association with the first online document.
2. The computer-implemented method of claim 1 wherein the act of estimating the value of the demographic attribute for the first online document using the accepted known values of the demographic attribute of each of the plurality of second online documents includes
i) estimating, by the computer system, a value of the demographic attribute for each of the independent users of the set using the accepted known values of the demographic attribute of each of the plurality of second online documents, and
ii) estimating, by the computer system, the value of the demographic attribute for the first online document using the estimated values of the demographic attribute estimated for the independent users of the set.
3. The computer-implemented method of claim 2 wherein the act of estimating a value of the demographic attribute for each of the independent users of the set using the accepted known values of the demographic attribute of each of the plurality of second online documents includes
i) summing over all the second online documents visited by the user, the corresponding value of the demographic attribute of the second online documents to generate a summing result, and
ii) dividing the summing result with the number of second online documents visited by the user.
4. The computer-implemented method of claim 3 wherein the act estimating the value of the demographic attribute of the first online document includes
i) summing over all the users of the given set, the corresponding estimated value of the demographic attribute of each of the users to generate a summing result, and
ii) dividing the summing result with the number of users in the set.
5. The computer-implemented method of claim 1 wherein the act of estimating the value of the demographic attribute for the first online document includes
i) summing over all the users of the set, the corresponding estimated value of the demographic attribute of each of the users to generate a summing result, and
ii) dividing the summing result with the number of users in the set.
6. The computer-implemented method of claim 1 wherein the first and second online documents are Web pages.
7. The computer-implemented method of claim 1 wherein the first and second online documents are Websites.
8. Apparatus comprising:
a) at least one processor; and
b) at least one storage device storing processor executable instructions which, when executed by the at least one processor, cause the at least one processor to perform a method for estimating a value of a demographic attribute for a first online document, the method including
1) receiving, by a computer system including at least one computer, a set of one or more known values of the demographic attribute associated with each of a plurality of second online documents that have been visited by a set of independent users who have also visited the first online document;
2) estimating, by the computer system, the value of the demographic attribute for the first online document using the accepted known values of the demographic attribute of each of the plurality of second online documents;
and
3) storing, by the computer system, the estimated value of the demographic attribute for the first online document in association with the first online document.
9. The apparatus of claim 8 wherein the act of estimating the value of the demographic attribute for the first online document using the accepted known values of the demographic attribute of each of the plurality of second online documents includes
A) estimating, by the computer system, a value of the demographic attribute for each of the independent users of the set using the accepted known values of the demographic attribute of each of the plurality of second online documents, and
B) estimating, by the computer system, the value of the demographic attribute for the first online document using the estimated values of the demographic attribute estimated for the independent users of the set.
10. The apparatus of claim 9 wherein the act of estimating a value of the demographic attribute for each of the independent users of the set using the accepted known values of the demographic attribute of each of the plurality of second online documents includes
A) summing over all the second online documents visited by the user, the corresponding value of the demographic attribute of the second online documents to generate a summing result, and
B) dividing the summing result with the number of second online documents visited by the user.
11. The apparatus of claim 10 wherein the act estimating the value of the demographic attribute of the first online document includes
A) summing over all the users of the given set, the corresponding estimated value of the demographic attribute of each of the users to generate a summing result, and
B) dividing the summing result with the number of users in the set.
12. The apparatus of claim 8 wherein the act of estimating the value of the demographic attribute for the first online document includes
A) summing over all the users of the set, the corresponding estimated value of the demographic attribute of each of the users to generate a summing result, and
B) dividing the summing result with the number of users in the set.
13. The apparatus of claim 8 wherein the first and second online documents are Web pages.
14. The apparatus of claim 8 wherein the first and second online documents are Websites.
15. A non-transitory storage medium storing processor executable instructions which, when executed by at least one processor, cause the at least one processor to perform a method for estimating a value of a demographic attribute for a first online document, the method including:
a) receiving, by a computer system including at least one computer, a set of one or more known values of the demographic attribute associated with each of a plurality of second online documents that have been visited by a set of independent users who have also visited the first online document;
b) estimating, by the computer system, the value of the demographic attribute for the first online document using the accepted known values of the demographic attribute of each of the plurality of second online documents;
and
c) storing, by the computer system, the estimated value of the demographic attribute for the first online document in association with the first online document.
16. The non-transitory storage medium of claim 15 wherein the act of estimating the value of the demographic attribute for the first online document using the accepted known values of the demographic attribute of each of the plurality of second online documents includes
i) estimating, by the computer system, a value of the demographic attribute for each of the independent users of the set using the accepted known values of the demographic attribute of each of the plurality of second online documents, and
ii) estimating, by the computer system, the value of the demographic attribute for the first online document using the estimated values of the demographic attribute estimated for the independent users of the set.
17. The non-transitory storage medium of claim 16 wherein the act of estimating a value of the demographic attribute for each of the independent users of the set using the accepted known values of the demographic attribute of each of the plurality of second online documents includes
i) summing over all the second online documents visited by the user, the corresponding value of the demographic attribute of the second online documents to generate a summing result, and
ii) dividing the summing result with the number of second online documents visited by the user.
18. The non-transitory storage medium of claim 17 wherein the act estimating the value of the demographic attribute of the first online document includes
i) summing over all the users of the given set, the corresponding estimated value of the demographic attribute of each of the users to generate a summing result, and
ii) dividing the summing result with the number of users in the set.
19. The non-transitory storage medium of claim 15 wherein the act of estimating the value of the demographic attribute for the first online document includes
i) summing over all the users of the set, the corresponding estimated value of the demographic attribute of each of the users to generate a summing result, and
ii) dividing the summing result with the number of users in the set.
20. The apparatus of claim 15 wherein the first and second online documents are Web pages.
21. The apparatus of claim 15 wherein the first and second online documents are Websites.
US13/651,763 2007-01-30 2012-10-15 Probabilistic inference of site demographics from aggregate user internet usage and source demographic information Abandoned US20130282428A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US13/651,763 US20130282428A1 (en) 2007-01-30 2012-10-15 Probabilistic inference of site demographics from aggregate user internet usage and source demographic information

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US11/699,745 US8290800B2 (en) 2007-01-30 2007-01-30 Probabilistic inference of site demographics from aggregate user internet usage and source demographic information
US13/651,763 US20130282428A1 (en) 2007-01-30 2012-10-15 Probabilistic inference of site demographics from aggregate user internet usage and source demographic information

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
US11/699,745 Continuation US8290800B2 (en) 2007-01-30 2007-01-30 Probabilistic inference of site demographics from aggregate user internet usage and source demographic information

Publications (1)

Publication Number Publication Date
US20130282428A1 true US20130282428A1 (en) 2013-10-24

Family

ID=39669012

Family Applications (2)

Application Number Title Priority Date Filing Date
US11/699,745 Expired - Fee Related US8290800B2 (en) 2007-01-30 2007-01-30 Probabilistic inference of site demographics from aggregate user internet usage and source demographic information
US13/651,763 Abandoned US20130282428A1 (en) 2007-01-30 2012-10-15 Probabilistic inference of site demographics from aggregate user internet usage and source demographic information

Family Applications Before (1)

Application Number Title Priority Date Filing Date
US11/699,745 Expired - Fee Related US8290800B2 (en) 2007-01-30 2007-01-30 Probabilistic inference of site demographics from aggregate user internet usage and source demographic information

Country Status (2)

Country Link
US (2) US8290800B2 (en)
WO (1) WO2008095031A1 (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6327574B1 (en) * 1998-07-07 2001-12-04 Encirq Corporation Hierarchical models of consumer attributes for targeting content in a privacy-preserving manner
US20010049620A1 (en) * 2000-02-29 2001-12-06 Blasko John P. Privacy-protected targeting system
US20030074142A1 (en) * 1997-03-24 2003-04-17 Steeg Evan W. Coincidence detection programmed media and system
US20040167928A1 (en) * 2002-09-24 2004-08-26 Darrell Anderson Serving content-relevant advertisements with client-side device support
US20040205157A1 (en) * 2002-01-31 2004-10-14 Eric Bibelnieks System, method, and computer program product for realtime profiling of web site visitors
US20050021965A1 (en) * 2002-06-10 2005-01-27 Van Horn Richard J. In a networked environment, providing characterized on-line identities and matching credentials to individuals based on their profession, education, interests or experiences for use by independent third parties to provide tailored products and services
US20070094060A1 (en) * 2005-10-25 2007-04-26 Angoss Software Corporation Strategy trees for data mining
US20070180469A1 (en) * 2006-01-27 2007-08-02 William Derek Finley Method of demographically profiling a user of a computer system

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20010013009A1 (en) * 1997-05-20 2001-08-09 Daniel R. Greening System and method for computer-based marketing
WO2000033160A2 (en) 1998-12-03 2000-06-08 Expanse Networks, Inc. Subscriber characterization and advertisement monitoring system
AU2001234456A1 (en) * 2000-01-13 2001-07-24 Erinmedia, Inc. Privacy compliant multiple dataset correlation system
US20020152117A1 (en) * 2001-04-12 2002-10-17 Mike Cristofalo System and method for targeting object oriented audio and video content to users
US7162522B2 (en) * 2001-11-02 2007-01-09 Xerox Corporation User profile classification by web usage analysis
US8229957B2 (en) * 2005-04-22 2012-07-24 Google, Inc. Categorizing objects, such as documents and/or clusters, with respect to a taxonomy and data structures derived from such categorization
US9235849B2 (en) * 2003-12-31 2016-01-12 Google Inc. Generating user information for use in targeted advertising
US8321249B2 (en) * 2007-01-30 2012-11-27 Google Inc. Determining a demographic attribute value of an online document visited by users

Also Published As

Publication number Publication date
US8290800B2 (en) 2012-10-16
WO2008095031A1 (en) 2008-08-07
US20080183556A1 (en) 2008-07-31

Similar Documents

Publication Publication Date Title
US8290800B2 (en) Probabilistic inference of site demographics from aggregate user internet usage and source demographic information
US8676961B2 (en) System and method for web destination profiling
Schafer et al. Collaborative filtering recommender systems
US10108979B2 (en) Advertisement effectiveness measurements
Danaher et al. Factors affecting web site visit duration: A cross-domain analysis
Walker Sampling the Dirichlet mixture model with slices
US8799415B2 (en) Dual/blind identification
US20080028066A1 (en) System and method for population-targeted advertising
US20080005313A1 (en) Using offline activity to enhance online searching
US8639575B2 (en) Audience segment estimation
US20080004884A1 (en) Employment of offline behavior to display online content
US20120191539A1 (en) Category similarities
Bellassoued Uniqueness and stability in determining the speed of propagation of second-order hyperbolic equation with variable coefficients
Weideman et al. Parallel search engine optimisation and pay-per-click campaigns: A comparison of cost per acquisition
US20130151311A1 (en) Prediction of consumer behavior data sets using panel data
US20100023399A1 (en) Personalized Advertising Using Lifestreaming Data
US20130173574A1 (en) Search engine optimization with secured search
US10599981B2 (en) System and method for estimating audience interest
Murphy et al. Website-generated market-research data: Tracing the tracks left behind by visitors
Prantl et al. Website traffic measurement and rankings: competitive intelligence tools examination
US20160189202A1 (en) Systems and methods for measuring complex online strategy effectiveness
Lindell et al. A practical application of differential privacy to personalized online advertising
Degeling et al. Tracking and tricking a profiler: Automated measuring and influencing of BlueKai's interest profiling
US8321249B2 (en) Determining a demographic attribute value of an online document visited by users
US20110015951A1 (en) Evaluation of website visitor based on value grade

Legal Events

Date Code Title Description
AS Assignment
Owner name: GOOGLE INC., CALIFORNIA
Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:LAW, CHING;RAJARAM, GOKUL;RANGANATH, RAMA;REEL/FRAME:033702/0629
Effective date: 20070427

STCB Information on status: application discontinuation
Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION

AS Assignment
Owner name: GOOGLE LLC, CALIFORNIA
Free format text: CHANGE OF NAME;ASSIGNOR:GOOGLE INC.;REEL/FRAME:044144/0001
Effective date: 20170929