WO2001063454A2 - Dynamic targeting with experimentation over a network

Info

Publication number: WO2001063454A2
Application number: PCT/US2001/005596
Authority: WO
Inventors: Lutz Hamel; John Charles Croy; Nicholas D. Sherman
Original assignee: Bluestreak.Com
Priority date: 2000-02-22
Filing date: 2001-02-22
Publication date: 2001-08-30
Also published as: WO2001063454A8; AU2001238621A1; EP1259895A2; WO2001063454A9

Abstract

A method, apparatus, system, means and computer program are provided for targeting the transfer of information such as advertising over a network such as the World Wide Web. The information to be presented can be selected based on the characteristics of a site or group of sites and the observed behavior of individuals or groups of individuals who visit that site or sites, instead of the preferences of an individual. The selected information can also change automatically and rapidly to account for changing behavioral trends observed over time for a group of the visitors of that site. Advertising can thus be targeted to users without human bias in the targeting, allowing effective advertising to be presented to sites even if the relationship between the advertising and the content of the sites is not clear.

Description

DYNAMIC TARGETING WITH EXPERIMENTATION OVER A NETWORK

Background of The Invention

1. Field of the Invention

This application relates to the field of prediction and more particularly to the field of prediction of acceptance of offers.

2. Description of Related Art

In many areas it is desirable to draw attention to information presented. One example is advertising. Advertisers have to draw attention to their advertisements from an audience that may or may not be interested in viewing them. This is particularly true in electronic advertising, where the advertiser is competing for attention against content that a user has searched out specifically. In order to better attract attention, advertisers have resorted to many different ways of placing their advertisement where it will be attractive to the user.

Traditionally, advertising across a network such as the Internet or the World Wide Web has been done through the presentation of a viewable window such as a click-able advertising banner. This banner is presented on a page the user accesses for the content provided and when clicked enables the user to be transferred to the advertiser's website, where the user has access to the advertiser's information.

In order to attract the eye of the viewer to these banners, such systems use a variety of techniques. For example, the systems incorporate animation or interactive displays in order to attract the viewer's attention. Systems can also provide interactive displays where a user can play a game, perform a task, or otherwise interact with the advertisement. Audio content may also be provided to allow the presentation of information outside of a visual media.

In addition, advertisers use targeting to better choose which advertisements to show to which individuals. Targeting is a concept that developed through traditional print media and television, and revolves around a very simple idea. Certain people will be interested in purchasing certain products, and advertising is most effective when it is presented to those people who are more likely to be interested in purchasing the product advertised. There is therefore the desire by advertisers to find the "target" group that is interested in their product so they can target their advertising to them. Traditionally, there have been two ways of performing targeting. Either the advertiser can try to find a medium frequented by the general type of individual attracted to their product, or can try to find the general preferences of a specific individual.

In print, television, and other classic media the first method is generally used. In many cases the media is specifically designed to attract certain types of users, and therefore can be useful to certain types of advertisers. Many magazines are written to appeal to certain types of readers; for instance, a magazine such as "Scientific American" is designed to appeal to individuals interested in science and discovery, "Sports Illustrated" is designed to appeal to those interested in spectator sports, and "Modern Bride" is designed to appeal to those who are soon to be married. Advertisers in these traditional media are therefore able to purchase time or space where they think their advertisement is most likely to be viewed by individuals who are more likely to be interested in their products. The problem with this type of targeting, however, is that although the audience of the magazine might be readily determined, it can be difficult to determine how well the audience for the product may match up with the audience for the magazine. The readers of "Modern Bride" magazine are likely to be women interested in purchasing products clearly associated with weddings such as wedding dresses, wedding rings, and wedding consultant services. It is, however, unclear if the readers of "Scientific American" or "Sports Illustrated," who in both cases are likely to be men, would be more interested in purchasing men's toiletries, power tools, or men's suits.

In addition to this type of problem, there is a further one: non-traditional relationships may not be clear. For instance, it could be the case that the readers of "Sports Illustrated" happen to regularly be looking to purchase women's suits, either as gifts or because the suspected readers do not constitute the total readers. This could be the case because the predominantly male subscribers do not make up the whole readership, since many wives or girlfriends of those subscribers happen to read the same copy of the magazine. The second problem that arises from the bias is that there is imperfect information regarding who actually reads the material and therefore sees the advertising. From the above example, it is possible that "Sports Illustrated" has a readership that differs from that expected, and seen, by the publisher. This is because the publisher can only have general information about who reads the magazine. In particular, the publisher only sees who the magazine is sold to, not who reads it. For these reasons, the advertising must be biased toward the group of individuals expected to view the material, instead of those who actually do.

As the Internet and other networks have gained increasing popularity as a method for acquiring information, advertisers have tried to target advertising for those browsing the web, and the banner advertisement on the top of a webpage has become an almost universal part of web surfing. The Internet presents a greater challenge to the advertiser than many of the more traditional mediums of advertisement. The content of the Internet is not as clear as that of a magazine, and many of the most popular sites (for example search engines) have no clear relationship to specific types of commercial goods or services. In addition, Internet banner ads have to compete against content that a user has specifically searched out, as opposed to being within a collection the reader might browse through. The failure to spot such non-traditional connections often results from selection bias on the part of those determining the target audience. In particular, there is often a human deciding which ads should be placed in which media based on the human's intuition or understanding of a connection between a particular audience and a particular message. For example, a connection of "Sports Illustrated" to men may seem correct, since there are more men than women among the magazine's subscribers. Also, from our own personal experience one might expect men to be more interested in spectator sports and thus "Sports Illustrated" should be read by men, and products likely to appeal to men are likely to be advertised. The difficulty lies in human bias in determining what should go where. While the bias might work for some easily targeted products, once one has assigned offers to particular advertising space, other products may be much harder to target. Difficult questions arise. For example, are women or men more likely to be interested in purchasing family cars or looking for a location to get employment services? 
Thus, should these products be advertised in "Sports Illustrated" or in "Modern Bride"?

On networks, advertisers continue to use many of the traditional methods for trying to target their advertising. In particular, sites such as "Weddingchannel.com" may have advertising for wedding products. These sites still suffer from the same problem as print media, however: human bias does not necessarily make non-traditional connections.

A further problem with a human-driven method is that it is very difficult to create a system wherein large numbers of sites can be classified in similar areas without significant errors due to human bias. For example, if a selection of 1000 websites were presented for advertising, it would be necessary to determine how to sort them into groups so as to determine how best to send advertising. Human bias surfaces again in determining these categories. If there were to be a category called "weddings" that category itself would result from human bias (i.e., there might be a better logical connection between sites that relate to weddings than the mere fact that they do so). Some better connection might offer better targeting. For example, would a site that supplies cakes and a site that supplies catering be better connected by a category that relates to "weddings" or a category that relates to "entertaining?" Arguably sites that fit in a "weddings" category are better represented by a broader category, such as "entertaining," which might encompass an audience interested in catering or cakes, but not necessarily for weddings. Many other examples of over- or under-inclusiveness result as humans try to target particular advertisements or offers for a particular category of offering based on other subject matter that is related to, but not identical with, the subject matter that they are seeking, e.g., through a search engine.

The Internet has provided advertisers with the ability to try and target individual users. This method follows the idea that certain individuals are always interested in certain types of products, and those users will express those interests by expressing interest in information or products of a certain type. There are two primary methods for tracking the users in such a way. The first involves analyzing input from the user to determine what might interest them. This can be done by either soliciting information from the user, for instance through a registration form, or by analyzing what the user is looking for in content. The latter is generally done through Internet search engines where an ad having a relation to a key word in the search is presented. In this way a user looking for the keyword "wedding" could receive an ad for wedding products. The clear problem with the method, however, is that the keyword may not provide the correct relationship that is being searched for. For instance, entering the keyword "wedding" could mean the user is searching for news of recent weddings, and would have no interest in wedding products.

A second method is to try and target users based on their individual preferences. Such methods generally involve storing information on every user and determining which kinds of advertisements most interest them. Basically, a complex log is kept for every user, recording advertising that they have responded to, and similar advertising is then presented to them. This type of advertising has two major problems. The largest is that it requires keeping information which many individuals would be opposed to providing. Since the advertisement must be customized to the user, it is necessary to keep information about that user. Many web surfers are opposed to giving out personal information, and recent controversies over browser cookies (which are a common method used to track individual users in such advertising) have erupted with increased demands for privacy on the web. These privacy concerns can result in users setting up browsers to prevent the gathering of such information, and can also result in users not responding to advertising for fear that their reaction will be recorded and will result in future targeted advertising based on that selection. In particular, someone may decline to click on an ad for a product they are curious about because they would not want to receive further advertising related to it.

Secondly, the system has a problem that it must keep a history of every individual, and that history cannot show changing preferences of that individual. Users' tastes change, and that change can be missed by the tracking system, or the tracking system can be so slow to respond to that change that the advertising presented is never relevant for a particular time period. Using the above wedding example, people are generally only interested in wedding products when they are getting married. By relying on an individual's prior preferences, the individual would not be presented with wedding related material at the appropriate time. Since they have never been interested in wedding related material previously, they are not presented with wedding related advertisement (they have shown no preference for it). They therefore cannot, or will only slowly, show a preference for wedding related material, and by the time such clear change has been made, the party might be already married and no longer have such an interest. This inability to cope with change is because the ability of the system to notice change depends on the activity of the user, and for many users, before a change can be noticed (through recognition of a statistical significance in changes in their interactions with advertising), preferences may have changed again.

Another technique that has arisen to automate, to some degree, the connection between a category of offering and a particular subject matter is known as collaborative filtering. In collaborative filtering, a relationship is presumed between the past acceptance of a particular offer or group of offers and the likelihood of acceptance of another offer. In particular, a co-occurrence frequency can be determined between, for example, acceptance of a first offer, X, and a second offer, Y. If the relative co-occurrence frequency is high between past purchases of X and Y (i.e., if those who purchase X demonstrate a higher frequency of purchasing Y than those who have not purchased X), then it can be concluded that Y should be offered to those who purchase X. Thus, after a purchase X has been made, it can be determined to subsequently offer the item that has the highest frequency of subsequent purchase based on historical records. This approach can be extended to multiple scenarios; for example, offers can be made based on the item Z that is most likely to be purchased after one has purchased both X and Y, or after one has purchased some other greater number of products in combination. For example, one might expect that someone who purchases a new set of golf clubs and a new golf bag might be shown an offer of golf shoes, if golf shoes are the item most frequently purchased in the past by those who have just purchased the former two items.

The problems with collaborative filtering are numerous. Like other approaches described above, it is based on a supposition about behavior, namely, that people's purchases of particular items are most closely related to past purchases, rather than to myriad other factors that affect a purchase. Thus, it creates a consistent bias toward offering items that are purchased in combination, in the same way that a more intuitive approach creates a variety of different, less predictable biases.
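
As a rough illustration of the co-occurrence idea described above, the following Python sketch counts how often items appear together in past purchase histories and then offers the item most often bought alongside a given purchase. The function names and the purchase histories are hypothetical and invented for illustration, and the sketch uses raw co-occurrence counts rather than the relative frequency comparison against non-purchasers mentioned in the text.

```python
from collections import Counter, defaultdict
from itertools import permutations


def cooccurrence_counts(histories):
    """For each item X, count how often each other item Y shares a history with X."""
    counts = defaultdict(Counter)
    for items in histories:
        # set() ensures an item repeated in one history is counted once.
        for x, y in permutations(set(items), 2):
            counts[x][y] += 1
    return counts


def recommend(counts, purchased):
    """Return the item most often co-purchased with `purchased`, or None."""
    candidates = counts.get(purchased)
    if not candidates:
        return None
    return candidates.most_common(1)[0][0]


# Hypothetical purchase histories, invented purely for illustration.
histories = [
    ["golf clubs", "golf bag", "golf shoes"],
    ["golf clubs", "golf shoes"],
    ["golf clubs", "golf bag", "golf shoes"],
    ["golf clubs", "golf balls"],
    ["tent", "sleeping bag"],
]
counts = cooccurrence_counts(histories)
offer = recommend(counts, "golf clubs")
```

The sketch also makes the bias discussed above visible: an item can only ever be offered if it has already been bought in combination with the triggering purchase.
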
In sum, in the current art, human beings place advertisements manually, making the processes inherently subject to bias and to a failure to identify trends and connections in data. A need exists for a system that obtains accurate results without requiring human action.

Summary Of The Invention

Provided herein are methods, systems, and means for predicting an event, such as the acceptance of an offer in an online display. Included are various steps, including obtaining a first data set capable of symbolic manipulation, such as a plurality of pages of web content or URLs, obtaining a second data set capable of symbolic manipulation, such as a plurality of advertisements, mapping the data sets into self-organizing maps, and establishing an online learning engine for generating experiments as to the mapping of the data sets and refining the mapping based on the results of the experiments. The mapping can then be used to predict events, such as the purchase of goods or services by a user who encounters a display. Provided further is a paradigm wherein an ongoing match-learn-refine cycle can be established, permitting automated learning of optimal mappings over time.

It is therefore desirable in the art to have a system, method, and means of targeting the transfer of information such as advertising over a network such as the World Wide Web that can select the information to be presented based on the characteristics of a site or groups of sites and the observed behavior of groups of visitors or individuals who visit a site or sites, instead of the preferences of an individual, and that the selected information be able to change automatically to account for changing behavioral trends observed for a group of visitors of that site rapidly over time. It is further desirable to have a system, method, and means for targeting information such as advertising to users without having human bias in the targeting, therefore allowing effective advertising to be presented to sites even if the relationship between the advertising and the content of the sites is not clear.

In one embodiment, this invention describes a system, method, and means for displaying material on a network that includes a plurality of content providers providing content in a plurality of different areas, a plurality of displays having media content in a plurality of different areas, a way of organizing the plurality of content providers into content clusters based on said content where each content provider is a member of at least one content cluster, a way of organizing the displays into media clusters based on said media content, where each display is a member of at least one media cluster, and a way of linking at least one of said media clusters to at least one of said content clusters in such a way that a user of the network who accesses one of the content providers in a particular content cluster is provided with the associated content and at least one display from a particular media cluster linked to that content cluster.
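
The relationship between content clusters, media clusters and links in this embodiment can be pictured with a small Python sketch. All cluster names, site names and display names below are hypothetical, and the dictionaries are only one assumed way of holding the memberships; the sketch simply shows a display being chosen from a media cluster linked to the content cluster of the provider being visited.

```python
import random

# Hypothetical cluster memberships; every name here is invented for illustration.
content_clusters = {
    "entertaining": ["cakes.example", "catering.example"],
    "sports": ["scores.example"],
}
media_clusters = {
    "party-supplies": ["banner_balloons", "banner_tableware"],
    "tickets": ["banner_season_pass"],
}
# Links from each content cluster to one or more media clusters.
links = {"entertaining": ["party-supplies"], "sports": ["tickets"]}


def clusters_of(provider):
    """Return every content cluster that the provider is a member of."""
    return [name for name, members in content_clusters.items() if provider in members]


def select_display(provider, rng=random):
    """Pick a display from any media cluster linked to the provider's content clusters."""
    candidates = [
        display
        for cluster in clusters_of(provider)
        for media in links.get(cluster, [])
        for display in media_clusters[media]
    ]
    return rng.choice(candidates) if candidates else None
```

Note that the selection depends only on which site is being visited, not on any stored history of the individual visitor.
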

In a further embodiment of the invention the above system, method, or means can involve experimentation in which an advertisement from a media cluster not linked to the content cluster containing the content provider is sometimes presented instead of an advertisement from the linked cluster; if that display proves to be more effective, the linking is changed to the new cluster, or the display is moved to the linked cluster. A learning engine using such experimentation allows users of the methods, systems and means disclosed herein to proactively investigate a variety of possible characteristics about mappings, including, but not limited to, predictions based on trends in customer behavior.
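
One way to picture this experimentation is the following Python sketch, which occasionally serves a display from an unlinked "challenger" media cluster and relinks when the challenger shows a higher response rate. The class name, the `explore_rate` and `min_trials` parameters, and the plain comparison of click rates are all assumptions made for illustration; a real system would presumably also test for statistical significance before changing a link.

```python
import random


class LinkExperiment:
    """Tracks responses for a linked media cluster and an unlinked challenger."""

    def __init__(self, linked, challenger, explore_rate=0.1, min_trials=100):
        self.linked = linked
        self.challenger = challenger
        self.explore_rate = explore_rate  # assumed share of impressions used as experiments
        self.min_trials = min_trials      # assumed minimum challenger sample before relinking
        self.shown = {linked: 0, challenger: 0}
        self.clicked = {linked: 0, challenger: 0}

    def choose(self, rng=random):
        """Mostly serve the linked cluster; occasionally serve the challenger."""
        return self.challenger if rng.random() < self.explore_rate else self.linked

    def record(self, cluster, clicked):
        """Record one impression and whether the user responded to it."""
        self.shown[cluster] += 1
        self.clicked[cluster] += int(clicked)

    def rate(self, cluster):
        """Observed response rate, or 0.0 if the cluster has not been shown."""
        shown = self.shown[cluster]
        return self.clicked[cluster] / shown if shown else 0.0

    def maybe_relink(self):
        """Swap the link if the challenger has enough trials and a higher rate."""
        enough = self.shown[self.challenger] >= self.min_trials
        if enough and self.rate(self.challenger) > self.rate(self.linked):
            self.linked, self.challenger = self.challenger, self.linked
            return True
        return False
```

In use, `choose` would be called for each page view, `record` for each observed outcome, and `maybe_relink` periodically.
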

In a further embodiment of the invention, changing preferences of the visitors to the page can be recorded so that trends in the displays' impact can be recorded and used to plan for the placing of displays. This embodiment includes changing displays automatically in a steady curve as those changes are noticed and noticing the trends and recording them for future use.

In another embodiment, the learning engine applies time-series based data mining algorithms to predict cycles and trends.

In a further embodiment of the invention such a system, method, or means does not require gathering information on an individual and effective tracking can be accomplished without the invasion of privacy present in many current advertising schemes.

A further embodiment of the invention comprises a user interface whereby an operator can interact with the system providing linking or recognizing trends. This embodiment can also further comprise providing to the provider of the display the interface so that they can track their display's effectiveness and change the linking related to their displays if they desire. This user interface shows the effectiveness of the system and allows the user to further define associations between data sets.

In a further embodiment of the invention a system, method, or means is provided that can automatically determine a starting point for providing advertisements within web pages, that can automatically learn if an advertisement is effective or if an alternative advertisement is more effective, and that can supply the more effective advertisement automatically. This embodiment can further comprise components that allow the system to automatically adjust for changing trends that are cyclical, permanent, or random.

In a further embodiment of the invention a system, method, and means is provided whereby advertisements and content can be clustered together in a method that eliminates human bias in the selection of the clustering.

As used herein, the following terms generally encompass the following meanings although these definitions are not intended to limit the plain meaning of any term as would be understood by one who is skilled in the art.

'User' generally denotes an entity, such as a human being, using a device, such as one allowing access to a network. This is typically a computer having a keyboard, a pointing device, and an a/v display device, with the computer running software able to display computer-originated material typically received from one or more separate computers. Preferably the user's computer is running browser software enabling it to act as a client and communicate by the Internet to one or more servers. The user can, however, be any entity connected to a network through any type of client.

'Browser' generally denotes, among other things, a process or system that provides the functionality of a client, such that it interconnects by a network to one or more servers. The browser may be Microsoft's Internet Explorer, Netscape's Navigator, an Active-X enabled browser, any other commercial or custom designed browser or any other thing allowing access to material on a network.

'Client' generally denotes a computer or other thing such as, but not limited to, a PDA, pager, phone, WebTV system, or any software or hardware process that interconnects by a network with one or more servers.

'Server' generally denotes one or more computers or similar things that interconnect by a network with clients and that have application programs running therein, such as for the purpose of transferring computer software, data, audio, graphic and/or other material. Server also includes any process or system for interconnecting via a network with clients.

'Symbol' or 'symbolic' generally denotes data that is represented by one or more symbols and that includes both data that can be represented numerically as well as data that is represented non-numerically, such as images, text, words, figures, symbols, and the like.

'Viewable window' generally refers to any display on a browser that is a component of another display. The viewable window may contain any kind of display. A viewable window includes, but is not limited to, a computer window, an advertising banner window, or an HTML call to an image file.

'Advertising' generally denotes a presentation of material which has an at least partial content or component with advertising purpose or connotation. It may include, but is not limited to, solicitation, advertising, public relations or related material, news material, non-profit information, material designed to promote interest in a product or service, information enabling a user to search or view other content providers, or other material that might be of interest to the user.

'Display' generally denotes a visiographic image that is designed to be viewed by a user. A display can include, but is not limited to, advertising, visual information, text, graphics, images, photographs, animation, 3D displays, audio, interactive activities, any other type of material, or any of the previous in any combination.

'Word' generally denotes any type of text or language arranged into a discrete block. Words do not only comprise words in the English or other spoken or written language but can comprise, but are not limited to: words in any language, natural or artificial, including computer programming languages, machine readable languages, and machine languages; any collection of letters in any alphabet or combination of alphabets; a number or collection of numbers; a name such as a proper name or the title of a computer file; or any of the previous in any combination.

Brief Description Of Drawings

Fig. 1 is a high level view of one possible system of the invention.

Fig. 2 is a more detailed view of a host system and related elements in the embodiment of Fig. 1.

Fig. 3 is a schematic showing a matching, learning and refinement cycle as disclosed herein.

Fig. 4 is a schematic of certain clustering steps in a data mapping process that is disclosed herein.

Fig. 5 is a flowchart showing one example of the steps for mapping webpages.

Fig. 6 is a flowchart showing an example of how to cluster webpages by content based on one embodiment of the invention.

Fig. 7 shows an example of how experimentation can be used to optimize mapping.

Fig. 8 shows one example of a deployment workbench of the invention.

Fig. 9 shows the deployment workbench of Fig. 8 showing a single link of an advertising cluster to a webpage cluster.

Fig. 10 shows the deployment workbench of Fig. 8 with many links generated.

Fig. 11 is a schematic depicting elements of a learning engine as disclosed herein.

Fig. 12 shows a flowchart of one embodiment of the experimentation routine that leads to selection of improved linking.

Detailed Description of the Preferred Embodiment(s)

As an embodiment of the subject invention, the following descriptions and examples are discussed primarily in terms of the method executing over the World Wide Web utilizing Internet Java software executing within a browser and C++ software executing in a server, such as an Apache web server or other server capable of storage and manipulation of data structures and capable of serving content over a computer network, such as the Internet. Alternatively, the present invention may be implemented by Active-X, C++, other custom software schemes, telecommunications and database designs, or any of the previous in any combination. In an embodiment, the invention and its various aspects apply typically to the targeting of offers, such as advertisements, to a user of a personal computer equipped with visual graphic display, keyboard, mouse, and audio speakers, and equipped with browser software and functioning as an Internet World Wide Web client. In an embodiment, the invention also comprises the production of advertising to such a user as part of the visual and potentially audio content of a webpage. However, alternative embodiments will occur to those skilled in the art, and all such alternate implementations are included in the invention as described herein.

A consumer using a browser or an HTML viewing device may view a webpage, which may be downloaded and rendered as HTML. The HTML could include, but is not limited to, browser plug-in program codes, Java applet code, Active-X, XML references (such as from using XHTML), and/or any built-in HTML codes.

Referring to Fig. 1, a high-level schematic of a system in an embodiment of the invention is provided. In particular, one or more content providers 111 wish to provide targeted content over a computer network 150 to an end user 105 who is running a browser 107 that is connected to the computer network 150. In an embodiment depicted in Fig. 1, the content may be provided to a server 109 that serves content over the computer network 150. It should be understood that an independent host 160 or the content provider 111 might serve the content, and it should be understood that a particular host 160 might serve content from a single content provider 111 or from many different content providers 111, in different embodiments of the invention. Content of the content providers 111 might be stored in one or more databases 101, which may store advertising content and any other content suitable for viewing by the user 105 via the web browser 107. Alternatively, the content might be stored in a computer file or generated dynamically as it is served. In embodiments, content from various sources may be displayed simultaneously in proximity to other content on a user's display device; for example, in a conventional manner, a web page may be displayed with an advertising banner, and the banner, or the page, may be changed while the other remains static.

Referring to Fig. 2, a basic embodiment of a system in accordance with the present invention is depicted. In this embodiment there is a selection of displays within a display database 102, which may be included among the content databases 101, and which may be desired by the owners of the displays to be provided to users 105 of the network who may be interested in other content, such as content included in other content databases 101. Those displays may also be generally designed to be displayed to the user 105 as part of a viewable window on a user's browser 107 when the user 105 views content from a content provider 111 through the server 109. These displays might generally comprise banner ads with advertising content or other offers of products and services, but the displays could alternatively have any type of content as would be clearly recognizable to one of skill in the art. The content providers 111 will generally comprise website owners or other merchants, and their content will comprise information presented as a web page over the Internet. In embodiments, a source provider 113 may interact with the system. A source provider 113 may generally be a company that represents numerous content providers 111 who desire to have viewable windows within their content in exchange for some value. Generally the source provider 113 may be an advertiser network that has a collection of content providers 111 that have expressed a desire to be paid to have banner advertising displayed as part of the content on a webpage they provide. The source provider 113 will come to the owner of the display database 102 in order to purchase advertising displays and advertising services for use in those viewable windows provided by the content providers 111. Alternatively, the source provider 113 could be any type of entity that is a content provider 111, or has a list of content providers 111, desiring any kind of displays for a viewable window for any reason.

It should be understood that while viewable windows are understood to be an embodiment of a vehicle for delivery of web content, the methods, systems and means disclosed herein could be used for targeting other content, such as audible content played through a .wav file, MP3 player, speaker, or similar mechanism, a file downloaded to the user's computer, placement of a cookie on a user's browser, or other targeted information delivered to users of the Internet. Thus, embodiments that refer to viewable windows herein should be understood to encompass any content that is delivered to users over the Internet.

Once a content provider 111 or the source provider 113 has determined to supply displays (e.g., if a display database owner 102 and a source provider 113 have agreed to supply displays for the viewable windows of content providers 111), various processes of the host system 160 can be brought into operation. Those processes may include a content discovery unit 115, a display selector 117, an experiment generator 119, and a display discovery unit 123. These units may consist of software processes, or a combination of hardware and software. Each of these units or engines may operate to provide the functions described below.

Referring to Fig. 3, a high level schematic of various functions of a system as disclosed herein is provided. At its highest level, the systems, methods and means described herein can be thought of as providing a cyclical function involving a step 310 of matching or mapping one or more data sets to one or more other data sets, a step 312 of online learning about the quality of the mapping, and a step 314 of refining the mapping identified at the step 310 to reflect the learning. Many different embodiments may be envisioned for each of the steps in this cycle. Certain preferred embodiments for the steps are disclosed herein.

It should be understood that the cycle depicted in Fig. 3 and the systems, methods, and means disclosed herein might be applied in a wide variety of contexts. The systems, methods and means are of particular utility in contexts involving data sets that include not only numerical, but also symbolic content; i.e., content that has symbolic meaning that is not entirely reducible to numbers. Examples include stock data, financial data, matching of auctions with people, television advertising to viewers, targeting of video-on-demand offerings, and, in a preferred embodiment, mapping of advertising offers to content sets that are viewed by Internet users. In the advertising context, advertisers typically have data sets that can be located in the display database 102. It is desired to have a system for initially matching the displays to appropriate content. It is further desired to learn what displays are best matched with particular content, then to modify the matching of the data sets to provide an optimal mapping of displays to content. Referring to Fig. 4, a high level schematic is provided that identifies an example of an initial mapping of two data sets. In an embodiment, a self-organizing map technique may be used to obtain an initial mapping of displays and content. Thus, at a step 252 the system obtains a content data set, which may include symbolic (i.e., not only numeric) data. The system may then, at a step 254, cluster the content data set according to content, using a technique such as a self-organizing map technique described below. The system, method and means disclosed herein may also obtain a display data set at a step 258, and cluster that data set at a step 260, again using a self-organizing map or similar technique. Once the two content sets are clustered according to categories or neighborhoods of content, then they can be mapped at a step 262, which may be done automatically or by human intervention, as described more particularly below. 
This clustering and mapping step may be viewed as a pre-processing step that precedes the learning and refinement steps in the cycle depicted in Fig. 3. It should be noted that any mapping of data sets of any type can be used to seed a learning and refinement cycle; that is, the cycle of learning, refinement, and further mapping is not dependent on the particular clustering methods disclosed herein, on clustering methods generally, or on any particular mapping. Examples of other initial mappings that are encompassed within the invention include random mappings of display data to content data, statistical mappings, cookie-based mappings, mappings based on collaborative filtering, mappings based on instance-based learning algorithms such as those described in Machine Learning, McGraw-Hill, 1997, ISBN 0-07-042807-7, herein incorporated by reference, or any other mappings. For example, the system could start with raw URLs, without any understanding of their content. The learning and refinement system will optimize the mapping, regardless of the initial mapping; however, the speed and effectiveness of the system will be enhanced with preferred mapping techniques, such as the self-organizing map clustering techniques disclosed herein.

Further details of an embodiment of a mapping step will now be disclosed. Referring to Fig. 2, in an embodiment, the content discovery unit 115 may retrieve content from one of the content databases 101, which may be populated by data from one or more content providers 111 in a conventional manner. Next, the content discovery unit 115 may cluster the content from the content databases 101 according to the nature of the content. In an embodiment, the content is sorted into clusters according to the content provider 111 itself. Once the content discovery unit 115 has sorted the content into clusters, a display selector 117 selects a display to be provided to the server 109 when a content provider's 111 content is accessed by the user 105 on their browser 107 by following a link from the content cluster that includes that content to a cluster of displays which are desired to be provided to users 105 who are viewing content in that cluster. In order to acquire that linking, the display selector 117 can obtain clustering information from a display content discovery unit 123 which clusters displays in the display database 102 according to the content of the displays. The display selector 117 provides a mapping of clusters of content from the content databases 101 (e.g., according to content providers 111) to displays from the display database 102. In an embodiment, the display selector 117 also accesses an experiment generator 119 which can provide for a further mode of selection for displays from the display database 102 by providing experimental displays that are from other clusters, rather than from the previously mapped cluster. The experiment generator 119 in conjunction with the analyzer 120 can also further be responsible for determining when these experimental displays are preferred to the mapped cluster of displays and can then change the mapping of the clusters in the display selector 117. 
Thus, the display selector 117 and experiment generator 119 operate in concert to supply the match-learn-refine cycle of Fig. 3.

The display selector 117 can determine the content provider 111 being accessed by the user 105 by any method known to the art. In one embodiment, the display selector 117 is told the URL the display is being shown on, when the initial applet call for a display is received. That is, the applet calling for the advertising banner also provides the URL back to the system. This allows the display selector 117 to select the display and send it expediently back in response to the call for the display. In order to better understand the invention, these components will be discussed in detail below. All the following description relates to embodiments of the invention and it would be clear to one who is skilled in the art that alternative components of the system could be used.

Fig. 5 shows, in flowchart form, certain possible actions of the content discovery unit 115 in one embodiment of the invention. These steps may be viewed as a preprocessing step that precedes the continuous match-learn-refine cycle of Fig. 3. In this embodiment of the invention, the content discovery unit organizes the content providers 111 such that each content provider 111 or item of content is associated with a specific vector, which represents its content. In an embodiment, the vector is determined through the words or phrases that are contained in the content. Alternatively the content discovery unit could assign an association between the content provider and a classification other than a vector, and/or could base that assignment on something other than the content as represented by the language contained. This includes, but is not limited to, URL, owner, text, general category of service provided, colors appearing on the web site, images and other multi-media components that can be classified and fed back into the SOM system, land-based data sets, such as those based on zip codes, census data, or the like, or any other data, whether or not alphanumeric.

Referring still to Fig. 5, in an embodiment, the content discovery unit 115 analyzes the feature content of every web page and assigns the page an n-dimensional vector representing the feature sets of the site. For example, feature sets could include the URL, words used by the site, sounds, images, or other multimedia content. That is, these features can comprise any text or other features provided as part of or associated with the website, including, but not limited to, the text presented, the underlying HTML code, the URL address, the image file names, or any other text present as part of the web site, multimedia contents, or other features, whether viewable by the user, part of the programming, or hidden from view. In addition, the content discovery unit 115 could use any subset of features available as part of or associated with the website. The content discovery unit 115 depicted here may first, in a step 201, select a web page to analyze from the list of webpages provided by the source provider. Next, in a step 203, the content discovery unit 115 may send out a web crawler or spider 203 to that web page. In a step 205, the crawler may then filter the content of the page into a feature set representative of all the features associated with the web page and the number of times that each feature appears. The feature set may then be simplified through additional steps to better use the features to represent the content of the website 207. For example, the feature set could be modified to eliminate words that do not encourage cataloging of the content (for instance "the", "and", or "it") from the calculation by consulting a list of basic words or a stop list and deleting them from the feature set. The organizing could also comprise a stemming step where words are reduced to their stems in a truncated form, allowing similar words to be grouped together as is well known to the art. 
For instance, "invite", "invitations" or "invited" could be represented by the truncated form "invit!" using a stemming technique. The feature set could be further modified using a statistical dictionary of common words for websites. As opposed to a stop list, where the words do not aid in understanding the content no matter their use, a statistical dictionary eliminates words because they are overly used on web pages and therefore cannot be used to distinguish sites. Words such as "Microsoft", "computer", or "www" could be eliminated as being overly common on the network and providing insufficient information by which to differentiate the providers. Finally, words which are combinations of other words, especially where the combination does not result in a word in the English, or other, language could be broken into their components. This step would be primarily useful with respect to the URL of the site, since the URL often contains multiple words connected together. For instance, Weddingchannel.com could be sorted using all these steps to "wed!," "channel" and ".com", the last of which is eliminated as being overly common. In addition, various statistical techniques could be used to weight some or all words in the feature sets. For example, various statistics relating to term frequency, such as Robertson's term frequency formula, and the like, provide weighting of relevance of particular terms in accordance with their frequency of use in a particular document, and the inverse of their frequency of use in the entire set of web pages or other documents that are being examined. Such statistics may be normalized according to various statistical techniques known in the art. Thus, a weighted feature set may be determined for a web page, based on the words that appear, as well as the weight assigned to each such word.
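The filtering and weighting steps described above can be sketched as follows. This is only an illustrative sketch under stated assumptions: the stop list, the statistical dictionary, the suffix-stripping stemmer, and the function names are all hypothetical, and the weighting shown is a plain term-frequency times inverse-document-frequency product rather than Robertson's formula.

```python
import math
import re
from collections import Counter

# Assumed word lists for illustration only; a real system would use larger ones.
STOP_WORDS = {"the", "and", "it", "a", "of", "to"}
COMMON_WEB_WORDS = {"www", "com", "computer", "microsoft"}  # "statistical dictionary"
SUFFIXES = ("ations", "ation", "ings", "ing", "ed", "es", "e", "s")


def stem(word):
    """Crude stemmer: strip the first matching suffix, keeping at least 3 letters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def feature_counts(text):
    """Count stemmed words after removing stop words and overly common web words."""
    words = re.findall(r"[a-z]+", text.lower())
    kept = [stem(w) for w in words
            if w not in STOP_WORDS and w not in COMMON_WEB_WORDS]
    return Counter(kept)


def weighted_features(pages):
    """Map page name -> {stem: term frequency * log(N / document frequency)}."""
    counts = {name: feature_counts(text) for name, text in pages.items()}
    doc_freq = Counter()
    for c in counts.values():
        doc_freq.update(c.keys())  # each page contributes 1 per distinct stem
    n = len(pages)
    return {
        name: {t: tf * math.log(n / doc_freq[t]) for t, tf in c.items()}
        for name, c in counts.items()
    }


pages = {
    "a": "The wedding invitations and the cake",
    "b": "the computer cake",
}
weights = weighted_features(pages)
```

In this sketch a stem that appears on every page receives a weight of zero, which mirrors the idea that overly common words cannot distinguish sites.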

After the feature set has been modified, it may be stored in a manner analogous to storing of an n-dimensional vector associated with the content provider 111 it is from in a step 208. This can be viewed as analogous to mapping of an n-dimensional vector, such as would occur in assigning a direction to every feature (e.g., a word) that is present in the final feature set with a distance in that direction equal to the number of times that feature appears (perhaps modified further to reflect other statistics, such as the inverse document frequency of a word in the set of documents as a whole). The system then determines at a step 211 if there are other web pages (and their associated content providers) that have not yet been assigned. If there are such additional webpages, the whole process is repeated for a new site by returning to the step 201. Once all the sites provided have been processed in this manner, the content providers 111 or other units of content are ready to be clustered in a step 209. Any method of clustering or mapping can be used with the present methods, systems, and means, but in a preferred embodiment a Self-Organizing Map (SOM) of the type described by Teuvo Kohonen in "Self-Organization of Very Large Document Collections: State of the Art", Proceedings of Eighth International Conference on Artificial Neural Networks, 1998, Vol. 1, pp. 65-74, Springer-Verlag, 1998, herein incorporated by reference and attached hereto as Exhibit A, is used. A clustering method using a SOM according to this reference is depicted in Fig. 6. In this case the SOM follows the steps of generating an m-dimensional space from all the words (directions) in every vector associated with content providers 111 or other content units that are to be clustered in a step 301. This will generally be every content provider 111 or every unit of content that has had a vector assigned to it, but it can be a selected subset of those content providers. 
The value of m will comprise the total number of different words in all vectors. Each word is assigned exactly one dimension and the total comprises the m dimensions. Since the vectors were originally n-dimensional, they are now made m-dimensional in a step 303. If a vector does not contain one or more of those m-dimensions, mi, then the value in the mi direction for that vector is assigned to zero. In a step 305, the SOM then goes through and according to its mapping method begins to arrange the vectors in such a way that they can be displayed two-dimensionally in a way that still maintains a visual sense of their relationships in two dimensions.
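As a rough illustration of steps 301 through 305, the sketch below expands sparse feature sets into a shared m-dimensional space (missing words set to zero) and then trains a very small self-organizing map on a two-dimensional grid. The grid size, learning rate, neighborhood schedule, and function names are assumptions chosen for brevity; they are not taken from the Kohonen reference or from this description.

```python
import math
import random


def to_dense(feature_sets):
    """Expand sparse {word: weight} sets into m-dimensional lists (missing -> 0.0)."""
    vocab = sorted({w for fs in feature_sets for w in fs})
    return vocab, [[fs.get(w, 0.0) for w in vocab] for fs in feature_sets]


def best_unit(grid, vector):
    """Return the grid cell whose weight vector is closest to the input vector."""
    return min(
        grid,
        key=lambda cell: sum((a - b) ** 2 for a, b in zip(grid[cell], vector)),
    )


def train_som(vectors, rows=2, cols=2, epochs=200, seed=0):
    """Train a tiny SOM; returns {(row, col): weight vector}."""
    rng = random.Random(seed)
    m = len(vectors[0])
    grid = {(r, c): [rng.random() for _ in range(m)]
            for r in range(rows) for c in range(cols)}
    for epoch in range(epochs):
        frac = 1.0 - epoch / epochs               # decays from 1 toward 0
        rate = 0.5 * frac                         # assumed learning-rate schedule
        radius = max(rows, cols) / 2.0 * frac + 0.01
        for v in vectors:
            bmu = best_unit(grid, v)
            for cell, w in grid.items():
                d2 = (cell[0] - bmu[0]) ** 2 + (cell[1] - bmu[1]) ** 2
                h = math.exp(-d2 / (2 * radius * radius))  # neighborhood strength
                for i in range(m):
                    w[i] += rate * h * (v[i] - w[i])
    return grid


feature_sets = [{"wed": 3.0, "cake": 1.0}, {"wed": 2.0}, {"cruise": 4.0}]
vocab, dense = to_dense(feature_sets)
som = train_som(dense)
cells = [best_unit(som, v) for v in dense]  # 2-D position assigned to each vector
```

Vectors that land on the same or adjacent cells would be treated as belonging to the same content neighborhood.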

In one embodiment of the invention, the final map comprises a two-dimensional map of the points. Because of the nature of the mapping process of a SOM, points close to each other on the two-dimensional map are considered to be more similar than points that are far apart. As part of its computation, the SOM can create regions within the two-dimensional map which are called clusters. Feature vectors assigned to these clusters can be considered very similar. Alternatively, clusters could be drawn within this map by any other method known to the art. The clusters can comprise any number of individual content providers as is desired, and each cluster will generally not be of equal, or even similar, size. By carrying out these steps it should be clear that there will eventually be clusters of content providers where they have been clustered based on similarities in the language, and therefore content, of their web pages.

Once the clusters have been created they are provided to the display selector 117 to be matched with appropriate displays when that page is accessed 307. Fig. 8 shows one potential layout for providing the clusters to the display selector 117. In the layout in Fig. 8 and in SOM-terminology these clusters are often referred to as "content neighborhoods" since the clusters are also interrelated by the location on the two-dimensional map of the vectors in a similar manner to neighbors displayed on a street map. The display content discovery unit 123 can use a similar method to select content similarities as the method described above and in Figs. 5 and 6. The display content discovery unit 123 however will go through displays of media instead of the other content of the content providers 111 in order to cluster the displays. In one embodiment of the invention, the display content discovery unit 123 provides the clustering of the displays in a similar format to the clustering of the content providers, allowing an operator using an embodiment of the display selector 117 to link the two types of clusters. Alternatively, different methods could be used to cluster displays and the content providers.

Although the content discovery unit 115 and the display content discovery unit 123 have been described above in significant detail using significant specialized organizational methods such as a SOM, it should be clear to the user that many variations of mapping of content providers and displays based on their content, that are known now or are discovered in the future, would be apparent to one of skill in the art to be covered by this invention. In particular, this invention includes applications such as, but not limited to, using other methods of clustering as opposed to SOM, using other methods of cataloging the language content of the content providers or displays, providing clustering represented in any number of dimensions, random mapping, mapping based on instance-based learning, mapping based on collaborative filtering, statistical mapping, or mapping using any of the previous in any combination.

Figs. 8, 9 and 10 show one possible interface that can be a component of a display selector to select displays that are in clusters linked to the cluster of the content provider. In particular, it shows a deployment workbench which could be used by an operator, such as a human being, to link the sets of clusters together through an intuitive graphical interface. The interface depicted is intended to be human-operated, although such interface could use mechanical help routines or could alternatively be completely automated. In this embodiment, the operator first sees a display whereby different clusters of both displays (advertisements in this case) 501 and content providers 111 or other content (specific webpages in this case) 503 are arranged parallel to each other. There is no need that the maps be displayed with this similarity in a deployment workbench, but such similarity has advantages for human operators. In addition, the operator can see a list of cataloging terms for each of the clusters of content 505 and displays 507. In this figure, the cataloging terms were chosen by the SOM as representing the most important word in its clustering decision. They are therefore provided to allow a human operator to have a method for quickly seeing what the clusters have in common, wordwise. In Fig. 9 the user has connected two clusters 601 and 603 by clicking and dragging a line between the two clusters 605. Therefore this line shows a simple representation of the link between the clusters. In Fig. 10, multiple lines 701 have been created linking many different areas. The user can now see a representation of how many ads are to be presented. Also, since the interaction is simple and intuitive, the interface can be provided to an advertiser allowing them to easily modify an advertising scheme by the simple point and click actions.

As was discussed above, there is no reason that a human operator is necessary as part of the display selector 117, or the human operator can be aided by a machine as part of the display selector 117. In particular, in Fig. 10 the multiple lines 701 are shown to represent the linking of various clusters as generated by a computer. The computer may use a variety of algorithms, such as determining for each cluster of content provider 111 or content which item of display content has the highest sum of co-occurrence of words, perhaps weighted according to the frequency of certain words in the document set as a whole, to determine that the clusters should be linked. In this interface, the user could now choose to override or modify the computer's selection, providing a semi-automated system where a human operator can be aided by a machine. Alternatively, the same method used above to provide aid and suggestions to the user could be used to completely automate the selection system and eliminate the need for a human operator. The computer can simply make the best choice connection for all the clusters and then use that linking.
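One way the automated linking could be approximated is sketched below: each content cluster is linked to the display cluster with the highest weighted sum of co-occurring words. The cluster names, the use of the minimum of the two counts as the co-occurrence measure, and the optional per-word weights are illustrative assumptions only.

```python
def link_clusters(content_clusters, display_clusters, word_weight=None):
    """Link each content cluster to the display cluster sharing the most words.

    Both arguments map a cluster id to a {word: count} dictionary.
    word_weight optionally scales each word (e.g. by rarity in the document set).
    """
    word_weight = word_weight or {}
    links = {}
    for cid, cwords in content_clusters.items():

        def score(did, cwords=cwords):
            dwords = display_clusters[did]
            shared = set(cwords) & set(dwords)
            return sum(
                min(cwords[w], dwords[w]) * word_weight.get(w, 1.0)
                for w in shared
            )

        # sorted() makes tie-breaking deterministic (first id alphabetically).
        links[cid] = max(sorted(display_clusters), key=score)
    return links


content = {
    "weddings": {"wed": 5, "cake": 2},
    "travel": {"cruise": 4, "flight": 3},
}
displays = {
    "bridal_ads": {"wed": 3, "dress": 2},
    "airline_ads": {"flight": 6},
}
links = link_clusters(content, displays)
```

An operator using the workbench could then override any entry of the resulting dictionary, which corresponds to the semi-automated mode described above.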

As part of this invention, there is provided an interface like the deployment workbench above that can be part of a business of selling advertising. In particular, a seller of advertising space can generate a deployment workbench for a client where the advertisements supplied as the displays comprise the ads mapped and the websites mapped comprise those sites where the user would like to, or is able to, supply his ads. The advertiser could then generate an initial mix of ads and, if they were unsatisfied with the performance of some or all the ads, change the ads available or rearrange the mapping of content to displays to look for more successful combinations. Such a business method would take the control for choosing which ads to display where from an advertising executive and provide direct and immediate control to the advertiser. The interface could be supplied directly (for example as software), or could be supplied as part of a network interface where the advertiser could create, modify, and place their advertisement into the system, and could then modify its progress on the network.

Referring to Fig. 11, once a mapping has been determined as part of the matching step 310 of Fig. 3, a continuous match-learn-refine cycle can be initiated. The learning step 312 of Fig. 3 can be accomplished by a learning engine 1102, which in turn may include an analyzer 120 and the experiment generator 119. The learning engine 1102 can be an instance-based, online, machine-learning engine. The experiment generator 119 can generate hypotheses as to alternative mappings of displays to content that might (or might not) achieve better success. The analyzer 120 can evaluate the results of each experiment of the experiment generator 119 by comparing against past experiments, and can suggest the best performing mapping. Based on the analysis by the analyzer 120 of each experiment with the experiment generator 119, the refinement step 314 of Fig. 3 can be implemented. 
Also, the learning engine can apply time-series based data mining algorithms to predict cycles and trends. Fig. 7 shows a possible embodiment of a display showing the results of a refinement process. The preference bar 1301 shows a desired mapping based on a randomly generated "ideal" mapping. The display history 1303 shows which display had been mapped to that particular content cluster. The changing shades in the display history 1303 show where displays have been selected until all displays match the "ideal" mapping. The display selector 117, as part of its operation, can also consult an experiment generator 119 in selecting what display to provide when a particular content provider is accessed. The experiment generator 119 may be designed to choose displays, not from a cluster that is linked to the content provider being accessed, but from another cluster. The purpose of the experiment generator 119 is to provide a display that is not provided based on the above-referenced mapping, to see if it is a preferred choice to the display determined in the mapping. In advertising, the experiment generator 119 may find non-traditional mappings that happen to exist by periodically testing to see if the assumptions behind the links are valid. By doing this, the system can "learn" if there are advertisements that are better in certain environments even if such a connection is not intuitive to a human operator. The system eliminates human bias because it constantly searches for better mappings that may not be subject to error due to human bias.

One of the things desired in an experiment generator 119 is that the content cluster or other mapping be accessed often enough so that its experiments can have statistical significance. In one embodiment of the invention, 1 in every 10 ads provided to a given cluster would be an experiment with the sample size set appropriately at every experiment time frame to be statistically valid. In addition to detecting human bias, the system can actively try to destroy human bias, as well as recognize trends that result in the changing of the users' preferences. In particular, the experiment generator can carry out its experiments, and if it determines that an experimental cluster of displays is doing better on a particular cluster of content providers or with a particular set of content, it can change the mapping, so that the better-performing cluster is newly mapped to that content cluster and the old mapping is deleted.
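A minimal sketch of the selection behavior described above follows, assuming that the "1 in every 10" ratio is realized as a random draw with probability 0.1. The class and method names are hypothetical, and the sketch leaves the statistical decision about when to remap to the caller.

```python
import random


class ExperimentingSelector:
    """Serve the mapped display cluster, with occasional experimental displays."""

    def __init__(self, mapping, display_clusters, experiment_rate=0.1, seed=None):
        self.mapping = dict(mapping)              # content cluster -> display cluster
        self.display_clusters = display_clusters  # display cluster -> list of displays
        self.experiment_rate = experiment_rate    # assumed: 0.1 for "1 in every 10"
        self.rng = random.Random(seed)

    def select(self, content_cluster):
        """Return (display_cluster, display, is_experiment) for one request."""
        mapped = self.mapping[content_cluster]
        others = [d for d in self.display_clusters if d != mapped]
        is_experiment = bool(others) and self.rng.random() < self.experiment_rate
        cluster = self.rng.choice(others) if is_experiment else mapped
        display = self.rng.choice(self.display_clusters[cluster])
        return cluster, display, is_experiment

    def remap(self, content_cluster, better_display_cluster):
        """Replace the old mapping once an experimental cluster performs better."""
        self.mapping[content_cluster] = better_display_cluster


selector = ExperimentingSelector(
    mapping={"weddings": "bridal"},
    display_clusters={"bridal": ["b1"], "audio": ["a1"]},
    experiment_rate=0.1,
    seed=42,
)
choice = selector.select("weddings")
```

Calling `remap` deletes the old link implicitly, since each content cluster holds exactly one mapped display cluster in this sketch.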

If the experimentation process is carried out long enough, it is possible to make a system, such as an instance-based learning system, that can learn what are the best connections. This system can therefore be implemented as a fully automated display selector whereby an optimal display cluster can be used with the content providers of a particular content cluster. This is shown in Fig. 12. In the fully automated embodiment of the invention, both the content providers and the displays are mapped by the system in a step 801. The system then automatically chooses a best fit for mapping the content clusters with the display clusters and begins providing the displays according to those links in a step 803. The system also selects an experiment display that is not in the display cluster 804 and performs experiments using displays that are not mapped to particular clusters in a step 805. Experimentation can be performed by examining behavior of a collection of users 105 who are interacting with the original displays mapped to a particular content cluster relative to the behavior of a collection of users 105 who are interacting with the experimental alternative display or displays. User behavior measured for such experiments might be any type of behavior measured through the Internet, such as, for example, the rate of acceptance by users 105 of an offer displayed in a display, the amount of time a user spends with the display displayed on the user's browser before moving on to other content, the user moving the user's pointing device toward the display, the user clicking on the display, or other actions. In a preferred embodiment, acceptance of an offer associated with a display is measured for the initial display mapped to a cluster and for an experimental display. It should be understood that more than one experiment might be run for a particular cluster at any given time in different embodiments of the system. 
Since the mapping is automatic, experiments may be run in real time, reflecting actual user preferences for particular displays for a particular cluster at any given moment. Thus, by such experiments, the system determines if there is a better selection for mapping the clusters in a step 807. If there is an improvement, the system changes the mapping in a step 809 and, in either case, selects a new experimental display in a step 804 and begins performing experiments again in a step 805. In this fully automated system, the system always strives to present the best displays on the sites by constant updating in search of the best mapping. This system can also adapt for changing circumstances, and for shocks. If it turns out that toys were popular as children's gifts, but due to changes in parental attitudes, audio devices became more popular, the system would detect the shift by noting a higher rate of acceptance of the audio equipment advertisements for a cluster, relative to a lower rate of acceptance of toy advertisements. It will therefore start to place displays related to audio equipment instead of toys in these places. It is also important to note that since the system recognizes global trends, it does not try to predict the individual's trends. In particular, in the wedding case above, due to external motivation the individual now looking for wedding products is more likely to visit sites where wedding products are globally popular. Thus, he is immediately targeted for his new change. The concept recognizes that changes to global behavior simulate changes to individual behavior.
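The decision at step 807, whether an experimental display outperforms the mapped display, could be made in many ways. The sketch below assumes a one-sided two-proportion z-test on offer acceptance rates with a threshold of 1.645; the description does not specify a particular test, so both the test and the threshold are assumptions, as are the function names.

```python
import math


def z_score(accepts_a, shown_a, accepts_b, shown_b):
    """Two-proportion z statistic for rate B minus rate A (pooled standard error)."""
    p_a = accepts_a / shown_a
    p_b = accepts_b / shown_b
    pooled = (accepts_a + accepts_b) / (shown_a + shown_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / shown_a + 1 / shown_b))
    return 0.0 if se == 0 else (p_b - p_a) / se


def should_remap(mapped, experiment, z_threshold=1.645):
    """Return True if the experiment's acceptance rate is significantly higher.

    mapped and experiment are (accepted, shown) pairs for one content cluster.
    z_threshold=1.645 is an assumed one-sided 5% significance level.
    """
    return z_score(mapped[0], mapped[1], experiment[0], experiment[1]) > z_threshold


# Toy advertisements (mapped) versus audio advertisements (experiment):
toys = (20, 1000)    # 2% acceptance
audio = (50, 1000)   # 5% acceptance
remap = should_remap(toys, audio)
```

With small experimental samples the same difference in rates may not pass the threshold, which is why the experiment time frame and sample size matter.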

Beyond its ability to notice changing trends, the system can also notice and adapt to timing considerations. As a hypothetical, people might be more interested in cruises during the summer (longer vacation time), but airfare in the winter (flying for holidays). The system will notice as winter approaches that the links to cruise advertisements are not as popular as the experiments with airfare advertisements. As this trend is captured, the system will automatically start providing additional airfare advertisements as the winter approaches. In the summer as the trend shifts back, the computer notices the trend and begins to shift the advertising back to cruises. This system is sophisticated enough that it can follow many types of shifting preferences, automatically adjusting the ads as the preferences shift. In addition, such trends may be pre-determined by human users, so that the initial mapping of offers may be tied to external factors, such as time of day. That is, experiments can be run that store offer acceptance according to such other contexts, such as time of year, time of day, and time of week, rather than merely updating according to offer acceptance in real time. The mapping can then migrate from such an initial mapping as described above. In addition, the system can deal with quick shifts, assuming that it has enough displays shown that the experiments become statistically significant quickly enough. Using the above example, it may actually be the case that people book airfare in the morning and cruises in the evening. So long as there were enough hits during the middle of the day, this trend would be noticed by the computer and could be incorporated into the mapping. In order to ensure that such quick changes can be recognized, choosing an appropriate experiment time frame to ensure an appropriate number of experiences is crucial. 
In one embodiment of the invention, the experiment time frame can be set by a human operator in such a way that experiments will reach statistical significance in a reasonable time. This embodiment gains enough hits to meet the above speed desire.
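Storing offer acceptance by context, as described above, might look like the following sketch, which keys acceptance counts by content cluster, time-of-day bucket, and display cluster. The bucket boundaries, the class name, and the `min_shown` cutoff are assumptions made for illustration.

```python
from collections import defaultdict


def time_bucket(hour):
    """Assumed bucketing of a 0-23 hour into coarse times of day."""
    if 5 <= hour < 12:
        return "morning"
    if 12 <= hour < 17:
        return "midday"
    if 17 <= hour < 22:
        return "evening"
    return "night"


class ContextualStats:
    """Track acceptance per (content cluster, time bucket, display cluster)."""

    def __init__(self):
        self.shown = defaultdict(int)
        self.accepted = defaultdict(int)

    def record(self, content_cluster, display_cluster, hour, accepted):
        key = (content_cluster, time_bucket(hour), display_cluster)
        self.shown[key] += 1
        if accepted:
            self.accepted[key] += 1

    def best_display_cluster(self, content_cluster, hour, min_shown=1):
        """Return the display cluster with the highest acceptance rate, or None."""
        bucket = time_bucket(hour)
        rates = {}
        for (c, b, d), n in self.shown.items():
            if c == content_cluster and b == bucket and n >= min_shown:
                rates[d] = self.accepted[(c, b, d)] / n
        return max(sorted(rates), key=rates.get) if rates else None


stats = ContextualStats()
stats.record("travel", "airfare", 9, True)
stats.record("travel", "airfare", 9, True)
stats.record("travel", "cruise", 9, False)
stats.record("travel", "cruise", 20, True)
stats.record("travel", "airfare", 20, False)
```

In practice `min_shown` would be set high enough that each bucket's rate rests on a statistically meaningful number of displays.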

In addition to the automatic learning shown by the program above, the system can also learn about specific trends and can begin changing even before the experimentation registers a shift. In particular, the system can further include programming instructions, or instructions to provide data to human analysts, where it examines the types of mapping to certain clusters over periods such as days, weeks, or years and searches for trends. The searching of trends over periods of time is well known to the art. If a trend is noticed (for instance the daily trend in vacations above), the computer can now set up so that it automatically switches links, or begins switching links at a certain time. Thus, the computer can learn and make even the experimentation steps more effective. By learning such trends, the final user can much more regularly be presented with an ad that is of interest. This allows an increase in their satisfaction as they are connected to products or services they desire, while at the same time making the advertising more cost-effective for the advertiser.

In another embodiment of the invention, the learning can be extended so that new advertisements can be presented to the system, which can select the best and worst and track trends in advertising, even as the advertising it is using is changed. As can be seen from the above embodiments, if the advertising available to the system is static, the system will eventually reach a point where the experiments decrease in effectiveness (because the mapping has already reached its maximized locations). This situation only occurs if there are a limited number of advertisements (or content provider clusters) available. In this embodiment, the content providers and displays available are in regular change, so that the clustering of both is likely also to be in regular change.

By operating a system that continually tests, and can be presented with new information, it is possible to test multiple different advertising campaigns simultaneously to decide which is most effective, and to do so in real time without a need for advance testing. The ads are tested with real consumers, and the most successful are used, while the least successful are automatically eliminated. This testing could also be used across industries, where customers of the display database owner could be provided quick information that their new advertising campaign is not proving as effective as their competitor's. Finally, the system could comprise many ads from similar manufacturers, covering the entire year, and the system could know when it is best to begin showing Christmas, Hanukkah, Halloween, or Valentine's Day ads and automatically show them without input from the advertiser.
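The automatic elimination of the least successful ads might, purely as an illustration, be sketched as follows. The disclosure does not state an elimination rule; the minimum number of displays, the fraction dropped, and the name `prune_ads` are assumptions for this example.

```python
def prune_ads(stats, min_shows=100, drop_fraction=0.25):
    """Split ads into (kept, dropped) by observed acceptance rate.

    stats: dict mapping ad id -> (shows, accepts).
    Only ads with at least min_shows displays are eligible to be dropped,
    so that newly presented ads remain in rotation until they are tested.
    drop_fraction is the share of tested ads to eliminate (assumed rule).
    """
    threshold = max(min_shows, 1)  # avoid dividing by zero displays
    tested = [
        (accepts / shows, ad)
        for ad, (shows, accepts) in stats.items()
        if shows >= threshold
    ]
    tested.sort()  # lowest acceptance rate first
    n_drop = int(len(tested) * drop_fraction)
    dropped = [ad for _, ad in tested[:n_drop]]
    kept = [ad for ad in stats if ad not in dropped]
    return kept, dropped
```

Running such a step at the end of each experiment time frame would let several campaigns compete with real consumers at once, while a newly submitted ad is retained until it has been shown often enough to be judged.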

The combination of an initial mapping, such as a SOM mapping, with the regular generation of hypotheticals or experiments, permits the creation of an ongoing, cyclical learning system that can optimize offers in real time. The learning and refinement cycle can also permit human or automatic intervention, by conducting experiments and refinements based on any type of prediction, ranging from selecting the offers most highly selected in a recent time frame, experimenting with offers based on collaborative filtering, experimenting based on intuition about trends, experimenting based on co-occurrence frequencies of acceptance of certain offers, experimenting using Bayesian or random walk approaches, or other predictive or statistical techniques. The system can be configured to accept any kind of data for use in experimentation, whether it be based on advertisements or other data, such as statistics kept in external databases. The systems used herein can be used in any situation where it is desirable to predict behavior based on data sets that can be presented symbolically and in which there is a desire to have optimal matching of data sets. Embodiments where symbolic data sets may be particularly useful include the travel industry, the financial industry, the securities industry, the home buying industry, shopping engines, shopping 'bots, and the like.

In another embodiment of the invention, a virtual advertising network can be created, analogous to hardware virtual IP address systems or an autonomous system (e.g., blocks of URLs that are linked regardless of content). In particular, if an advertiser wishes to deliver advertising content, the advertiser must currently place advertisements through multiple providers, or the advertiser will only capture a small part of the market. Therefore, advertisers typically place ads with a variety of different networks. An example is a so-called "media buy," which involves placing ads on a set of individual portals or ad networks.

A virtual network allows further refinement of that set and allows the buyer to cross traditional boundaries between traditional networks. The virtual network is provided by a host, who can place media buys with multiple networks. The host can then aggregate media buys and place advertisements across multiple networks. Once the content for multiple networks is aggregated, the host can then apply the systems and methods disclosed herein to identify the optimal placement of advertisements across multiple networks of ads, while also providing users with a single source for buying, optimizing and placing advertisements. Among other things, virtual networks can be priced independently, based on success of the optimization. Virtual networks can also be used to identify the best targeted offers within the network. The network can be made to shrink and grow so as to determine an optimized size of the virtual network. Also, collections of virtual networks can be evaluated using the mappings disclosed herein, to determine an ideal network for a particular user. Also, sets of virtual networks can be established, i.e., networks of networks, so that offers can be matched in an optimal manner over the supersets, as well as each network. It should be noted that other targeting mechanisms, such as collaborative filtering, profiling and the like, can be performed over the virtual networks disclosed herein.

Pseudocode for certain embodiments of the invention is disclosed as Exhibit B.

While the invention has been disclosed in connection with the preferred embodiments shown and described in detail, various modifications and improvements thereon will become readily apparent to those skilled in the art. Accordingly, the spirit and scope of the present invention is to be limited only by the following claims.

Claims

1. A method for displaying material on a network comprising: obtaining content from a plurality of content providers in a plurality of different areas; obtaining a plurality of displays having media content in a plurality of different areas; organizing said content into content clusters, where each item of content is a member of at least one content cluster; organizing said displays into media clusters based on said media content where each display is a member of at least one media cluster; and mapping at least one of said media clusters to at least one of said content clusters; whereby a user of said network who accesses one of said content providers that is within a particular content cluster is provided with the associated content and at least one first display from a particular media cluster linked to said particular content cluster.

2. The method of claim 1 further comprising: selecting at least one second display within at least one second particular media cluster, wherein said at least one second particular media cluster is not mapped to said particular content cluster and said at least one second display is displayed instead of, or in addition to, said at least one first display.

3. The method of claim 2 further comprising: comparing whether said user prefers said media content of said at least one first display or of said at least one second display.

4. The method of claim 3 wherein said comparing is repeated across multiple users accessing said one of said content providers.

5. The method of claim 4 further comprising: changing said mapping based on the success of said at least one second display relative to said at least one first display.

6. The method of claim 5, further comprising: mapping said particular content cluster to said second particular media cluster.

7. The method of claim 1 wherein said content comprises the text of a web page.

8. The method of claim 1 wherein said media content comprises advertising.

9. The method of claim 1 further comprising an interface whereby a first operator can manually create the mapping between any content cluster and any media cluster.

10. The method of claim 9 whereby a second operator can manually change the mapping between any content cluster and any media cluster.

11. The method of claim 10 wherein said first operator and said second operator comprise the same entity.

12. The method of claim 10 whereby said second operator comprises the owner of said media content or said display.

13. The method of claim 10 whereby said first operator comprises an independent host.

14. The method of claim 1 wherein said clustering occurs through the use of a self-organizing map and a neural net.

15. The method of claim 1, wherein said content of said content provider includes the URL address of said content provider.

16. The method of claim 1 wherein said clustering comprises generating a feature set vector to represent the language used in said content.

17. The method of claim 1 wherein which one of said content providers accessed by said user is determined by sending the URL of that content provider with a call for said display.

18. A method of doing business, comprising: having a plurality of media for presentation to a user; waiting for a user to access a content provider, said content provider having content and a request for media; receiving said request for media and providing media in response to said request for media that is targeted for said user without having to know any personal information of said user.

19. A method of targeting advertising over a network comprising: obtaining a plurality of content items; clustering the content items into a plurality of content clusters based on at least one of the words and other multi-media used in the content items; organizing the clusters into a plurality of self-organizing maps based on vectors representative of at least one of the words and other multi-media used in the content items; obtaining a plurality of displays; mapping at least one of the displays to at least one of the content clusters; and targeting a display to a customer based on the mapping.

20. A method of claim 19, further comprising: experimenting with a plurality of displays to determine a preferred display; and altering the mapping of at least one display to at least one content cluster based on the experiment.

21. A system for displaying material on a network comprising: a first database for storing content from a plurality of content providers in a plurality of different areas; a second database for storing a plurality of displays having media content in a plurality of different areas; a content discovery engine for organizing said content into content clusters; a display selector for organizing said displays into media clusters based on said media content where each display is a member of at least one media cluster and for mapping at least one of said media clusters to at least one of said content clusters; and a server for serving over a computer network a selected display to a user that is based upon the content cluster associated with content selected by the user in use of the computer network.

22. The system of claim 21, further comprising: an experiment generator for selecting at least one second display within at least one second particular media cluster, wherein said at least one second particular media cluster is not mapped to said particular content cluster and said at least one second display is displayed instead of, or in addition to, said at least one first display.

23. The system of claim 22, wherein the experiment generator is capable of comparing whether said user prefers said media content of said at least one first display or of said at least one second display.

24. The system of claim 23, wherein said comparing can be repeated across multiple users accessing said one of said content providers.

25. The system of claim 24, further comprising: a display selector for changing said mapping based on the success of said at least one second display relative to said at least one first display.

26. The system of claim 25, further comprising, a mapping engine for mapping said particular content cluster to said second particular media cluster.

27. A system of claim 21, wherein said content comprises the text of a web page.

28. A system of claim 21, wherein said media content comprises advertising.

29. A system of claim 21, further comprising: an interface whereby a first operator can manually create the mapping between any content cluster and any media cluster.

30. The system of claim 29, wherein a second operator can manually change the mapping between any content cluster and any media cluster.

31. A system of claim 30, wherein said first operator and said second operator comprise the same entity.

32. A system of claim 31, whereby said second operator comprises the owner of said media content or said display.

33. A system of claim 31, whereby said first operator comprises an independent host.

34. A system of claim 21, wherein said clustering occurs through the use of a self-organizing map.

35. A system of claim 21, wherein said content of said content provider includes the URL address of said content provider.

36. A system of claim 21, wherein said clustering comprises a generator for generating a feature set to represent the language used in said content.

37. A learning system for targeting an offer over a computer network, comprising: a content database for storing content of a plurality of content providers; a display database for storing a plurality of displays; a mapping engine for automatically mapping the displays into a plurality of content clusters, based on the co-occurrence of terms between the content in the content database and the display database; an experiment engine for determining the popularity of a first display mapped to at least one content cluster relative to a second display not mapped to the content cluster; and a learning engine for modifying the mapping determined by the mapping engine based on the outcome of the experiment engine.

38. A system of claim 37, wherein the mapping engine is capable of establishing a mapping based on at least one of a time period and a trend.

39. A system of claim 38, wherein the time period is at least one of a time of day, time of year, time of month, or time of week.

40. A system for delivering advertising content, comprising: an ad building engine for building content capable of delivery over a computer network; an ad placement engine for placing ad content on a web page in proximity to other content; and an ad targeting engine for targeting advertisements based on a self-organizing map of content clusters.

41. A system of claim 40, further comprising: a learning system based on instance-based learning for retargeting ads in real time based on their relative success.

42. A deployment workbench for targeting advertisements, comprising: a computer having a display; and a display on the computer, comprising a two-dimensional depiction of a self-organizing map, wherein the self-organizing map: depicts content clusters associated with content; depicts display clusters associated with advertisements; and is capable of graphically depicting a mapping of the content clusters to the display clusters.

43. A workbench of claim 42, wherein the user may alter the mapping by interacting with the display.

44. A workbench of claim 42, wherein the display is capable of changing in response to alterations in the mapping.

45. A workbench of claim 42, wherein the display changes in response to learning-based changes in the mapping.

46. A workbench of claim 42, further comprising: an experiment generator for generating a revised mapping for the display based on the relative success of an offer.

47. An online, symbolic, instance-based learning system, comprising: providing an analyzer for analyzing a mapping of a symbolic data set to another symbolic data set; and providing an experiment generator for generating an alternative mapping of the data sets.

48. A learning system of claim 47, further comprising: refining the mapping based on the results of an experiment generated by the experiment generator.

49. A method of claim 47, further comprising: obtaining the mapping using self-organizing mapping of the data sets.

50. A method of claim 48, further comprising: establishing an automated cycle of learning and refinement of the mapping.

51. A method of predicting an event, comprising: obtaining a first data set capable of symbolic manipulation; obtaining a second data set capable of symbolic manipulation; mapping the data sets into self-organizing maps; establishing an online learning engine for generating experiments as to the mapping of the data sets and refining the mapping based on the results of the experiments; and predicting the event based on a current mapping of the data sets.